This patch completes the code move out of ice_main.c
The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_setup
ice_vsi_cfg_tc
The following functions were made static again:
ice_vsi_setup_vector_base
ice_vsi_alloc_q_vectors
ice_vsi_get_qs
ice_vsi_map_rings_to_vectors
ice_vsi_alloc_rings
ice_vsi_set_rss_params
ice_vsi_set_num_qs
ice_get_free_slot
ice_vsi_init
ice_vsi_alloc_arrays
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch continues the code move out of ice_main.c
The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_setup_vector_base
ice_vsi_alloc_q_vectors
ice_vsi_get_qs
The following functions were made static again:
ice_vsi_free_arrays
ice_vsi_clear_rings
Also, in this patch, the netdev and NAPI registration logic was decoupled
from the VSI creation logic (ice_vsi_setup). For SR-IOV we want to create
VF VSIs using ice_vsi_setup, but we don't want to create netdevs for them.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch continues the code move out of ice_main.c
The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_clear
ice_vsi_close
ice_vsi_free_arrays
ice_vsi_map_rings_to_vectors
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch continues the code move out of ice_main.c
The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_alloc_rings
ice_vsi_set_rss_params
ice_vsi_set_num_qs
ice_get_free_slot
ice_vsi_init
ice_vsi_clear_rings
ice_vsi_alloc_arrays
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch continues the code move out of ice_main.c
The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_delete
ice_free_res
ice_get_res
ice_is_reset_recovery_pending
ice_vsi_put_qs
ice_vsi_dis_irq
ice_vsi_free_irq
ice_vsi_free_rx_rings
ice_vsi_free_tx_rings
ice_msix_clean_rings
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch continues the code move out of ice_main.c
The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_start_rx_rings
ice_vsi_stop_rx_rings
ice_vsi_stop_tx_rings
ice_vsi_cfg_rxqs
ice_vsi_cfg_txqs
ice_vsi_cfg_msix
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The functions that are used for PF VSI/netdev setup will also be used
for SR-IOV support. To allow reuse of these functions, move these
functions out of ice_main.c to ice_common.c/ice_lib.c
This move is done across multiple patches. Each patch moves a few
functions and may have minor adjustments. For example, a function that was
previously static in ice_main.c will be made non-static temporarily in
its new location to allow the driver to build cleanly. These adjustments
will be removed in subsequent patches where more code is moved out of
ice_main.c
In this particular patch, the following functions were moved out of
ice_main.c:
ice_add_mac_to_list
ice_free_fltr_list
ice_stat_update40
ice_stat_update32
ice_update_eth_stats
ice_vsi_add_vlan
ice_vsi_kill_vlan
ice_vsi_manage_vlan_insertion
ice_vsi_manage_vlan_stripping
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The system image guid is a read-only field which is used by the TC
offloads code to determine if two mlx5 devices belong to the same
ASIC while adding flows.
Read this once and save it on the core device rather than querying each
time an offloaded flow is added.
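A minimal sketch of the read-once-and-cache idea, in plain C rather than
driver code. The structure, the field names and all three helpers are
hypothetical stand-ins, and query_guid_from_fw() only simulates the query:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the core device; not the real mlx5 structure. */
struct core_dev {
	uint64_t fw_guid;        /* value the simulated firmware reports */
	uint64_t sys_image_guid; /* cached copy, filled in once at init */
	unsigned int fw_queries; /* number of simulated firmware queries */
};

/* Simulates the (comparatively expensive) query of the read-only field. */
static uint64_t query_guid_from_fw(struct core_dev *dev)
{
	dev->fw_queries++;
	return dev->fw_guid;
}

/* Called once per device: read the field and save it on the device. */
static void core_dev_init(struct core_dev *dev, uint64_t fw_guid)
{
	assert(dev != NULL);
	dev->fw_guid = fw_guid;
	dev->fw_queries = 0;
	dev->sys_image_guid = query_guid_from_fw(dev);
}

/* Per-flow check: compares the cached values, no query involved. */
static bool same_hw(const struct core_dev *a, const struct core_dev *b)
{
	return a->sys_image_guid == b->sys_image_guid;
}
```

With this shape, the per-flow comparison costs two loads instead of two
queries, which is safe only because the field is read-only.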
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently we practically never report checksum unnecessary, because
for all IP packets we take the checksum complete path.
Enable non-default runs with reporting checksum unnecessary, using
an ethtool private flag. This can be useful for performance evals
and other explorations.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
We can report checksum unnecessary also when the L3 checksum
flag on the cqe is set and there's no L4 header.
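The resulting decision can be sketched as below. This is an illustration
only: the structure, the enum and the function are hypothetical, the
private flag is modeled as a plain boolean, and the "L3 and L4 both OK"
branch is an assumption about the pre-existing behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum csum_result { CSUM_NONE, CSUM_COMPLETE, CSUM_UNNECESSARY };

/* Hypothetical summary of the checksum-related bits of a completion. */
struct cqe_info {
	bool is_ip;      /* packet is IP */
	bool l3_ok;      /* L3 checksum flag set on the cqe */
	bool l4_ok;      /* L4 checksum flag set on the cqe */
	bool has_l4_hdr; /* an L4 header was detected */
};

/*
 * no_csum_complete models the ethtool private flag: when false (the
 * default), every IP packet takes the checksum complete path.
 */
static enum csum_result rx_csum(const struct cqe_info *cqe,
				bool no_csum_complete)
{
	assert(cqe != NULL);

	if (!no_csum_complete && cqe->is_ip)
		return CSUM_COMPLETE;

	/* Assumed existing case: L3 and L4 OK. New case: L3 OK, no L4 header. */
	if (cqe->l3_ok && (cqe->l4_ok || !cqe->has_l4_hdr))
		return CSUM_UNNECESSARY;

	return CSUM_NONE;
}
```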
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Increased the number of channels the representors can open to be the
number of CPUs. The default number opened remains one.
Used the standard NIC netdev functions to:
* Set RSS params when building the representors' params.
* Setup an indirect TIR and RQT for the representors upon
initialization.
* Create a TTC flow table for the representors' indirect TIR (when
creating the TTC table, mlx5e_set_ttc_basic_params() is not called,
in order to avoid setting the inner_ttc param, which is not needed).
Added ethtool control to the representors for setting and querying
the number of open channels. Additionally, included logic in the
representors' ethtool set channels handler which controls a
representor's vport rx rule, so that if there is one open channel
the rx rule steers traffic to the representor's direct TIR, whereas
if there is more than one channel, the rx rule steers traffic to the
new TTC flow table.
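The channel-count rule described above can be sketched as follows. The
types and function names are hypothetical and only model the choice of
destination, not the actual steering objects:

```c
#include <assert.h>
#include <stddef.h>

/* Where the representor's vport rx rule points (model only). */
enum rep_rx_dest { REP_RX_DEST_DIRECT_TIR, REP_RX_DEST_TTC_TABLE };

/* Hypothetical representor state. */
struct rep {
	unsigned int max_channels; /* number of CPUs, per the text */
	unsigned int channels;     /* currently open channels */
	enum rep_rx_dest rx_dest;
};

static void rep_init(struct rep *r, unsigned int num_cpus)
{
	assert(r != NULL && num_cpus >= 1);
	r->max_channels = num_cpus;
	r->channels = 1; /* the default remains one */
	r->rx_dest = REP_RX_DEST_DIRECT_TIR;
}

/* Models the set-channels handler; returns 0 on success, -1 if invalid. */
static int rep_set_channels(struct rep *r, unsigned int count)
{
	if (count < 1 || count > r->max_channels)
		return -1;

	r->channels = count;
	/* One channel: direct TIR. More than one: the TTC flow table. */
	r->rx_dest = (count == 1) ? REP_RX_DEST_DIRECT_TIR
				  : REP_RX_DEST_TTC_TABLE;
	return 0;
}
```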
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Towards enabling RSS for the vport representors, expose the functions for
querying the rss hash key size and indirection table size via ethtool.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Towards enabling RSS for the vport representors, extract the
procedure for building a device's RSS params, and expose the
function.
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Change the driver functions that deal with creating indirect tirs
to get a flag telling if inner ttc is desired.
A pre-step for enabling rss on the vport representors, where
inner ttc is not needed.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently the destination for the representor e-switch rx rule is
a TIR number. Towards changing that to potentially be a flow table,
as part of enabling RSS for representors, modify the signature of
the related e-switch API to get a flow destination.
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Cleaning up the flow of the representors' rx initialization, towards
enabling RSS for the representors.
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Enabled checksum and TSO offloads for the representors, in
order to increase their performance, which is required to
increase the performance of flows that cannot be offloaded.
Checksum offloads contribute to a general acceleration of all
traffic (to around 150%), whereas the TSO offload contributes
to a prominent acceleration of the representor's TX for traffic
flows with larger than MTU sized packets (to around 200%). This
is the usual case for TCP streams, as the PF, which serves as
the uplink representor, and the VF representors employ GRO before
forwarding the packets to the representor.
GRO was enabled implicitly for the representors beforehand, and
is explicitly enabled here to ensure that the representors preserve
the performance boost it provides (of around 200%) when working in
tandem with the TSO offload by the forwardee, which is the standard
case as both the PF and the VF representors employ HW TSO.
The impact of these changes can be seen in the following
measurements taken on a setup of a VM over a VF, connected
to OVS via the VF representor, to an external host:
Before current changes:
TCP Throughput [Gb/s]
External host to VM ~ 10.5
VM to external host ~ 23.5
With just checksum offloads enabled:
TCP Throughput [Gb/s]
External host to VM ~ 14.9
VM to external host ~ 28.5
With the TSO offload also enabled:
TCP Throughput [Gb/s]
External host to VM ~ 30.5
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The representors' RQ size was not large enough for them to achieve
high enough performance, and therefore needed to be enlarged, while
taking only a minimal hit to memory usage. To achieve this, the
representors' RQ size was increased, and its type was changed to be a
striding RQ if it is supported.
Towards that goal the following changes were made:
* Extracted the sequence for setting the standard netdev's RQ params
into a function
* Replaced the sequence for setting the representor's RQ params with
the standard sequence
The impact of this change can be seen in the following measurements
taken on a setup of a VM over a VF, connected to OVS via the VF
representor, to an external host:
Before current change:
TCP Throughput [Gb/s]
VM to external host ~ 7.2
With the current change (measured with a striding RQ):
TCP Throughput [Gb/s]
VM to external host ~ 23.5
Each representor now consumes 2 [MB] of memory for its packet
buffers.
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Allow using partial masks for L3 addresses and L4 ports across
the place.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Merge tag 'iwlwifi-next-for-kalle-2018-09-28' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next
Second set of iwlwifi patches for 4.20
* TKIP implementation in new devices;
* Fix for the shared antenna setting in 22000 series;
* Report that we set the RU offset in HE code;
* Fix some register addresses in 22000 series;
* Fix one FW feature TLV that had a conflict with another value;
* A couple of fixes for SoftAP mode;
* Work continues for new 22560 hardware;
* Some fixes in the datapath;
* Some debugging and other general fixes;
* Some cleanups, small improvements and other general fixes;
Trivial fix to a spelling mistake in a struct field name; rename it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Make sure that the wifi device is a supported variant by checking its
CHIP ID before completing the probe sequence.
Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Waiting for "completion" to be set in the FW load thread cannot be used
if PCIe remove is called before the FW load work was scheduled.
Just wait for work completion instead to avoid problems.
Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Extract platform-independent PCIe driver code into a separate file, and
use it from platform-specific modules.
Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
A few include directives were missing in bus.h, making other modules
depend on include order. Add the missing includes.
Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Rename several functions to indicate that they are platform specific.
Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Move platform-independent PCIe data structure to a separate header file
so it can be reused by different devices.
Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
tx_lock name will later be reused when common pcie code is extracted to
separate files.
Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
In preparation to extract common PCIe driver state, indicate
PEARL-specific structures by their name and move them to pearl-specific
source file.
Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
In preparation to extract common pcie driver state into a separate
structure, rename Pearl-specific state to qtnf_pcie_pearl_state and move
it directly to pearl-specific PCIe source file.
Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
In preparation to extract common qtnfmac PCIe driver sources into a
separate file, move existing Pearl-specific pcie driver sources to pcie/
directory.
Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The firmware name is only needed at the probe stage, so there is no point
in keeping it in the driver state structure.
Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Implement custom rt2800mmio flush routine and change txstatus
routine to read TX_STA_FIFO also in the tasklet.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Use different tx status timeouts for normal operation and when flushing.
This increases the timeout to 2s for normal operation, because under bad
radio conditions, when frames are reposted many times, the device may not
provide the status for quite a long time. With the new timeout we can still
get valid status under such bad conditions.
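A minimal sketch of picking the timeout by mode. Only the 2s normal value
comes from the text above; the flush value, the names and the millisecond
time base are assumptions made for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TXSTATUS_TIMEOUT_NORMAL_MS 2000u /* 2s, as stated in the text */
#define TXSTATUS_TIMEOUT_FLUSH_MS   100u /* assumed; not given in the text */

/* Pick the timeout that applies in the current mode. */
static uint32_t txstatus_timeout_ms(bool flushing)
{
	return flushing ? TXSTATUS_TIMEOUT_FLUSH_MS
			: TXSTATUS_TIMEOUT_NORMAL_MS;
}

/* True if a frame submitted at submit_ms has waited too long at now_ms. */
static bool txstatus_timed_out(uint64_t submit_ms, uint64_t now_ms,
			       bool flushing)
{
	assert(now_ms >= submit_ms);
	return now_ms - submit_ms >= txstatus_timeout_ms(flushing);
}
```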
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Do not check for tx status timeout every time we run the txstatus tasklet.
Perform the check once per half a second.
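The rate limiting can be sketched as below. The structure and function
are hypothetical and use a caller-supplied millisecond clock; the driver
itself would use its own time source:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TXSTATUS_CHECK_PERIOD_MS 500u /* half a second, per the text */

/* Hypothetical per-device state for the rate limiter. */
struct txstatus_checker {
	uint64_t last_check_ms; /* time of the last timeout check */
};

/*
 * Called on every tasklet run. Returns true (and records the time) only
 * if at least half a second has passed since the previous check.
 */
static bool txstatus_check_due(struct txstatus_checker *c, uint64_t now_ms)
{
	assert(c != NULL);

	if (now_ms - c->last_check_ms < TXSTATUS_CHECK_PERIOD_MS)
		return false;

	c->last_check_ms = now_ms;
	return true;
}
```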
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Use usb txdone/txstatus routines (now in rt2800lib.c) for mmio devices.
Note this also changes how we handle the INT_SOURCE_CSR_TX_FIFO_STATUS
interrupt. It is now disabled from the IRQ routine until the end of the
txstatus tasklet (the same behaviour as for other interrupts). The reason
for not disabling this interrupt was to avoid missing any tx status from
the 16-entry FIFO register. Now that we check for tx status timeout, we can
afford to miss some tx statuses. However, this will be improved in a further
patch, where I also implement reading the status FIFO register in the tasklet.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
In order to reuse usb txdone/txstatus routines for mmio, move them
to common rt2800lib.c file.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c:1327:34:
warning: implicit conversion from enumeration type 'enum
btc_chip_interface' to different enumeration type 'enum
wifionly_chip_interface' [-Wenum-conversion]
wifionly_cfg->chip_interface = BTC_INTF_PCI;
~ ^~~~~~~~~~~~
drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c:1330:34:
warning: implicit conversion from enumeration type 'enum
btc_chip_interface' to different enumeration type 'enum
wifionly_chip_interface' [-Wenum-conversion]
wifionly_cfg->chip_interface = BTC_INTF_USB;
~ ^~~~~~~~~~~~
drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c:1333:34:
warning: implicit conversion from enumeration type 'enum
btc_chip_interface' to different enumeration type 'enum
wifionly_chip_interface' [-Wenum-conversion]
wifionly_cfg->chip_interface = BTC_INTF_UNKNOWN;
~ ^~~~~~~~~~~~~~~~
3 warnings generated.
Use the values from the correct enumerated type, wifionly_chip_interface.
BTC_INTF_UNKNOWN = WIFIONLY_INTF_UNKNOWN = 0
BTC_INTF_PCI = WIFIONLY_INTF_PCI = 1
BTC_INTF_USB = WIFIONLY_INTF_USB = 2
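A self-contained sketch of the fix pattern. The two enums and their values
are taken from the list above; struct wifionly_cfg is reduced to one
field, and enum hci_type with its values is a hypothetical stand-in for
whatever the caller switches on:

```c
#include <assert.h>
#include <stddef.h>

enum btc_chip_interface {
	BTC_INTF_UNKNOWN = 0,
	BTC_INTF_PCI = 1,
	BTC_INTF_USB = 2,
};

enum wifionly_chip_interface {
	WIFIONLY_INTF_UNKNOWN = 0,
	WIFIONLY_INTF_PCI = 1,
	WIFIONLY_INTF_USB = 2,
};

/* Reduced to the one field that matters here. */
struct wifionly_cfg {
	enum wifionly_chip_interface chip_interface;
};

/* Hypothetical bus type the caller switches on. */
enum hci_type { HCI_PCI, HCI_USB, HCI_OTHER };

static void wifionly_set_interface(struct wifionly_cfg *cfg, enum hci_type t)
{
	assert(cfg != NULL);

	/* Assign values of the field's own enum type, so no conversion occurs. */
	switch (t) {
	case HCI_PCI:
		cfg->chip_interface = WIFIONLY_INTF_PCI;
		break;
	case HCI_USB:
		cfg->chip_interface = WIFIONLY_INTF_USB;
		break;
	default:
		cfg->chip_interface = WIFIONLY_INTF_UNKNOWN;
		break;
	}
}
```

Because the numeric values match, the stored value is the same as before;
only the type of the constant changes.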
Link: https://github.com/ClangBuiltLinux/linux/issues/135
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Clang warns that the address of an array will always evaluate to true
in a boolean context:
drivers/net/wireless/ath/ath5k/debug.c:1031:14: warning: address of
array 'ah->sbands' will always evaluate to 'true'
[-Wpointer-bool-conversion]
BUG_ON(!ah->sbands);
~~~~~^~~~~~
./include/asm-generic/bug.h:61:45: note: expanded from macro 'BUG_ON'
#define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0)
^~~~~~~~~
./include/linux/compiler.h:77:42: note: expanded from macro 'unlikely'
# define unlikely(x) __builtin_expect(!!(x), 0)
^
1 warning generated.
Given that this condition is always false because of the logical not,
just remove it.
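Why the check can never fire is easy to see outside the kernel. In this
sketch struct hw is a hypothetical stand-in with an embedded array member;
the pointer is taken through a local variable only to keep the compiler
from flagging the comparison itself:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in: sbands is an array embedded in the structure. */
struct hw {
	int id;
	int sbands[3];
};

/*
 * An embedded array decays to a pointer to its first element, which lies
 * inside *ah. For any valid ah it is therefore never NULL.
 */
static bool sbands_present(const struct hw *ah)
{
	const void *p;

	assert(ah != NULL);
	p = ah->sbands;
	return p != NULL;
}
```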
Link: https://github.com/ClangBuiltLinux/linux/issues/130
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Clang warns that the address of an array will always evaluate to true
in a boolean context.
drivers/net/wireless/rsi/rsi_91x_mac80211.c:927:50: warning: address of
array 'key->key' will always evaluate to 'true'
[-Wpointer-bool-conversion]
if (vif->type == NL80211_IFTYPE_STATION && key->key &&
~~ ~~~~~^~~
1 warning generated.
Link: https://github.com/ClangBuiltLinux/linux/issues/136
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Clang warns when multiple pairs of parentheses are used for a single
conditional statement.
drivers/net/wireless/intel/ipw2x00/ipw2200.c:5655:28: warning: equality
comparison with extraneous parentheses [-Wparentheses-equality]
if ((priv->ieee->iw_mode == IW_MODE_ADHOC)) {
~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
drivers/net/wireless/intel/ipw2x00/ipw2200.c:5655:28: note: remove
extraneous parentheses around the comparison to silence this warning
if ((priv->ieee->iw_mode == IW_MODE_ADHOC)) {
~ ^ ~
drivers/net/wireless/intel/ipw2x00/ipw2200.c:5655:28: note: use '=' to
turn this equality comparison into an assignment
if ((priv->ieee->iw_mode == IW_MODE_ADHOC)) {
^~
=
1 warning generated.
Link: https://github.com/ClangBuiltLinux/linux/issues/134
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
This patch fixes the bug that all datapath and vport ops are returning
wrong values (OVS_FLOW_CMD_NEW or OVS_DP_CMD_NEW) in their replies.
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Structure 'tls_rec' contains sg_aead_in and sg_aead_out, which point
to aad_space and then chain to the scatterlists sg_plaintext_data and
sg_encrypted_data respectively. Rather than using chained scatterlists
for plaintext and encrypted data in aead_req, it is more efficient to store
aad_space in the first index of sg_encrypted_data and sg_plaintext_data
themselves and get rid of sg_aead_in, sg_aead_out and further chaining.
This requires increasing the size of the sg_encrypted_data and
sg_plaintext_data arrays by 1 to accommodate the entry for aad_space. The
code which uses sg_encrypted_data and sg_plaintext_data has been modified
to skip the first index, as it points to aad_space.
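The layout can be sketched with simplified types. struct sg_entry is a
hypothetical stand-in for a scatterlist entry, and MAX_FRAGS and the
13-byte aad_space size are assumed values chosen for the example:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAGS      4u  /* assumed fragment limit for this sketch */
#define AAD_SPACE_SIZE 13u /* assumed size of aad_space */

/* Hypothetical stand-in for one scatterlist entry. */
struct sg_entry {
	const void *buf;
	size_t len;
};

struct rec {
	unsigned char aad_space[AAD_SPACE_SIZE];
	/* One extra slot each: index 0 is reserved for aad_space. */
	struct sg_entry sg_plaintext_data[MAX_FRAGS + 1];
	struct sg_entry sg_encrypted_data[MAX_FRAGS + 1];
	unsigned int num_plaintext; /* data entries, index 0 excluded */
};

static void rec_init(struct rec *r)
{
	assert(r != NULL);
	r->sg_plaintext_data[0] =
		(struct sg_entry){ r->aad_space, sizeof(r->aad_space) };
	r->sg_encrypted_data[0] =
		(struct sg_entry){ r->aad_space, sizeof(r->aad_space) };
	r->num_plaintext = 0;
}

/* Data entries start at index 1. Returns -1 when the array is full. */
static int rec_add_plaintext(struct rec *r, const void *buf, size_t len)
{
	if (r->num_plaintext >= MAX_FRAGS)
		return -1;
	r->sg_plaintext_data[1 + r->num_plaintext++] =
		(struct sg_entry){ buf, len };
	return 0;
}

/* Code walking the payload skips index 0, which points to aad_space. */
static size_t rec_plaintext_len(const struct rec *r)
{
	size_t total = 0;
	unsigned int i;

	for (i = 1; i <= r->num_plaintext; i++)
		total += r->sg_plaintext_data[i].len;
	return total;
}
```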
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy says:
====================
tipc: make connection setup more robust
In this series we make a few improvements to the connection setup and
probing mechanism, culminating in the last commit where we make it
possible for a client socket to make multiple setup attempts in case
it encounters receive buffer overflow at the listener socket.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The default socket receive buffer size for a listener socket is 2 MB. For
each arriving empty SYN, the Linux kernel allocates a 768-byte buffer.
This means that a listener socket can serve a maximum of about 2700 simultaneous
empty connection setup requests before it hits a receive buffer
overflow, and much fewer if the SYN is carrying any significant
amount of data.
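The figure above follows from a simple division. A small sketch, assuming
the buffer is 2 * 1024 * 1024 bytes and that each empty SYN is charged
exactly 768 bytes (the helper name is hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Number of SYNs that fit before the receive buffer overflows. */
static size_t max_pending_syns(size_t rcvbuf_bytes, size_t per_syn_bytes)
{
	assert(per_syn_bytes > 0);
	return rcvbuf_bytes / per_syn_bytes;
}
```

Under those assumptions, max_pending_syns(2 * 1024 * 1024, 768) gives
2730, in line with the rounded figure quoted in the text.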
When this happens the setup request is rejected, and the client
receives an ECONNREFUSED error.
This commit mitigates this problem by letting the client socket try to
retransmit the SYN message multiple times when it sees it rejected with
the code TIPC_ERR_OVERLOAD. Retransmission is done at random intervals
in the range of [100 ms, setup_timeout / 4], as many times as there is
room for within the setup timeout limit.
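Picking a retransmission delay in that range can be sketched as below.
The function name is hypothetical, the random value is passed in by the
caller to keep the example deterministic, and the fallback for very short
setup timeouts is an assumption:

```c
#include <assert.h>

#define SYN_RETRY_MIN_MS 100u /* lower bound, per the text */

/*
 * Map a caller-supplied random number onto the closed range
 * [100 ms, setup_timeout_ms / 4].
 */
static unsigned int syn_retry_delay_ms(unsigned int setup_timeout_ms,
				       unsigned int rnd)
{
	unsigned int lo = SYN_RETRY_MIN_MS;
	unsigned int hi = setup_timeout_ms / 4;

	assert(setup_timeout_ms > 0);

	/* Assumed fallback: an empty or single-value range yields the minimum. */
	if (hi <= lo)
		return lo;

	return lo + rnd % (hi - lo + 1);
}
```

With a setup timeout of 8000 ms the delay falls in [100, 2000] ms, so at
least four attempts fit before the setup timeout expires.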
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Messages intended for initiating a connection are currently
indistinguishable from regular datagram messages. The TIPC
protocol specification defines bit 17 in word 0 as a SYN bit
to allow sanity check of such messages in the listening socket,
but this has so far never been implemented.
We do that in this commit.
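Setting and testing such a bit can be sketched as follows. The helper
names are hypothetical, and the sketch assumes word 0 has already been
converted to host byte order, with bit 17 counted from the least
significant bit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SYN_BIT_POS 17u /* bit 17 in word 0, per the text */

/* Return word 0 with the SYN bit set or cleared. */
static uint32_t msg_w0_set_syn(uint32_t w0, bool syn)
{
	const uint32_t mask = UINT32_C(1) << SYN_BIT_POS;

	assert(SYN_BIT_POS < 32u);
	return syn ? (w0 | mask) : (w0 & ~mask);
}

/* Sanity check a listener could apply to an incoming setup message. */
static bool msg_w0_is_syn(uint32_t w0)
{
	return (w0 >> SYN_BIT_POS) & 1u;
}
```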
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We refactor the function tipc_sk_filter_connect(), both to make it
more readable and as a preparation for the next commit.
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We refactor this function as a preparation for the coming commits in
the same series.
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function tipc_msg_reverse() is reversing the header of a message
while reusing the original buffer. We have seen on several occasions
that this may have unfortunate side effects when the buffer to be
reversed is a clone.
In one of the following commits we will again need to reverse cloned
buffers, so this is the right time to permanently eliminate this
problem. In this commit we let the said function always consume the
original buffer and replace it with a new one when applicable.
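The consume-and-replace pattern can be illustrated in user space. This is
not the TIPC code: struct msg_buf, its reference count and the helpers are
hypothetical, and "reversing" is reduced to swapping two address fields:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted message buffer. */
struct msg_buf {
	int refcnt;
	unsigned char src;
	unsigned char dst;
};

static struct msg_buf *buf_new(unsigned char src, unsigned char dst)
{
	struct msg_buf *b = malloc(sizeof(*b));

	if (b) {
		b->refcnt = 1;
		b->src = src;
		b->dst = dst;
	}
	return b;
}

/* Drop one reference; free the buffer when the last one goes away. */
static void buf_put(struct msg_buf *b)
{
	if (--b->refcnt == 0)
		free(b);
}

/*
 * Build the reversed header in a fresh buffer, then always consume the
 * caller's reference to the original. Other holders of the original
 * never see it modified. On failure *pb is set to NULL.
 */
static bool msg_reverse(struct msg_buf **pb)
{
	struct msg_buf *old;
	struct msg_buf *rev;

	assert(pb != NULL && *pb != NULL);
	old = *pb;
	rev = buf_new(old->dst, old->src);
	buf_put(old);
	*pb = rev;
	return rev != NULL;
}
```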
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>