Commit Graph

826587 Commits

Author SHA1 Message Date
Peng Li cc5ff6e90f net: hns3: free the pending skb when clean RX ring
If there is pending skb in RX flow when close the port, and the
pending buffer is not cleaned, the new packet will be added to
the pending skb when the port opens again, and the first new
packet has error data.

This patch cleans the pending skb when clean RX ring.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:47:35 -07:00
Jian Shen 2d0075b4a7 net: hns3: do not initialize MDIO bus when PHY is inexistent
For some cases, PHY may not be connected to MDIO bus, then
the driver will initialize fail since MDIO bus initialization
fails.

This patch fixes it by skipping the MDIO bus initialization
when PHY is inexistent.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:47:35 -07:00
Weihang Li c41e672d1e net: hns3: set dividual reset level for all RAS and MSI-X errors
According to hardware description, reset level that should be
triggered are not consistent in a module. For example, in SSU
common errors, the first two bits has no need to do reset,
but the other bits need global reset.

This patch sets separate reset level for all RAS and MSI-X
interrupts by adding a reset_lvel field in struct hclge_hw_error,
and fixes some incorrect reset level.

Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:47:35 -07:00
Yunsheng Lin 1a49f3c614 net: hns3: divide shared buffer between TC
Currently hardware may have not enough buffer to receive packet
when it has used more than two MPS(maximum packet size) of
buffer, but there are still a lot of shared buffer left unused
when TC num is small.

This patch divides shared buffer to be used between TC when
the port supports DCB, and adjusts the waterline and threshold
according to user manual for the port that does not support
DCB.

This patch also change hclge_get_tc_num's return type to u32
to avoid signed-unsigned mix with divide.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:47:35 -07:00
Yunsheng Lin db5936db8f net: hns3: always assume no drop TC for performance reason
Currently RX shared buffer' threshold size for speific TC is
set to smaller value when the TC's PFC is not enabled, which may
cause performance problem because hardware may not have enough
hardware buffer when PFC is not enabled.

This patch sets the same threshold size for all TC no matter if
the specific TC's PFC is enabled.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:47:35 -07:00
Yunsheng Lin d474d88f88 net: hns3: add hns3_gro_complete for HW GRO process
When a GRO packet is received by driver, the cwr field in the
struct tcphdr needs to be checked to decide whether to set the
SKB_GSO_TCP_ECN for skb_shinfo(skb)->gso_type.

So this patch adds hns3_gro_complete to do that, and adds the
hns3_handle_bdinfo to handle the hns3_gro_complete and
hns3_rx_checksum.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:47:35 -07:00
Yunsheng Lin a4d2cdcbb8 net: hns3: minor refactor for hns3_rx_checksum
Change the parameters of hns3_rx_checksum to be more specific to
what is used internally, rather than passing in a pointer to the
whole hns3_desc. Reduces duplicate code and bring this function
inline with the approach used in hns3_set_gro_param.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:47:35 -07:00
Jian Shen 92f11ea177 net: hns3: fix set port based VLAN issue for VF
In original codes, ndo_set_vf_vlan() in hns3 driver was implemented
wrong. It adds or removes VLAN into VLAN filter for VF, but VF is
unaware of it.

This patch fixes it. When VF loads up, it firstly queries the port
based VLAN state from PF. When user change port based VLAN state
from PF, PF firstly checks whether the VF is alive. If the VF is
alive, then PF notifies the VF the modification; otherwise PF
configure the port based VLAN state directly.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:47:35 -07:00
Jian Shen 21e043cd81 net: hns3: fix set port based VLAN for PF
In original codes, ndo_set_vf_vlan() in hns3 driver was implemented
wrong. It adds or removes VLAN into VLAN filter for VF, but VF is
unaware of it.

Indeed, ndo_set_vf_vlan() is expected to enable or disable port based
VLAN (hardware inserts a specified VLAN tag to all TX packets for a
specified VF) . When enable port based VLAN, we use port based VLAN id
as VLAN filter entry. When disable port based VLAN, we use VLAN id of
VLAN device.

This patch fixes it for PF, enable/disable port based VLAN when calls
ndo_set_vf_vlan().

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:47:34 -07:00
Jian Shen 44e626f720 net: hns3: fix VLAN offload handle for VLAN inserted by port
Currently, in TX direction, driver implements the TX VLAN offload
by checking the VLAN header in skb, and filling it into TX descriptor.
Usually it works well, but if enable inserting VLAN header based on
port, it may conflict when out_tag field of TX descriptor is already
used, and cause RAS error.

In RX direction, hardware supports stripping max two VLAN headers.
For vlan_tci in skb can only store one VLAN tag, when RX VLAN offload
enabled, driver tells hardware to strip one VLAN header from RX
packet; when RX VLAN offload disabled, driver tells hardware not to
strip VLAN header from RX packet. Now if port based insert VLAN
enabled, all RX packets will have the port based VLAN header. This
header is useless for stack, driver needs to ask hardware to strip
it. Unfortunately, hardware can't drop this VLAN header, and always
fill it into RX descriptor, so driver has to identify and drop it.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:47:34 -07:00
Jian Shen 741fca1667 net: hns3: modify VLAN initialization to be compatible with port based VLAN
Our hardware supports inserting a specified VLAN header for each
function when sending packets. User can enable it with command
"ip link set <devname> vf  <vfid> vlan <vlan id>".
For this VLAN header is inserted by hardware, not from stack,
hardware also needs to strip it from received packets before
sending to stack.  In this case, driver needs to tell
hardware which VLAN to insert or strip.

The current VLAN initialization doesn't allow inserting
VLAN header by hardware, this patch modifies it, in order be
compatible with VLAN inserted base on port.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:47:34 -07:00
David S. Miller 81f2eeb370 Merge branch 'net-phy-shrink-PHY-settings-array-and-add-200Gbps-support'
Heiner Kallweit says:

====================
net: phy: shrink PHY settings array and add 200Gbps support

The definition of array settings[] is quite lengthy meanwhile. Add a
macro to shrink the definition.

When doing this I saw that the new 200Gbps modes and few 100Gbps/50Gbps
modes aren't supported in phylib yet. So add this.

To avoid ethtool and phylib mode definitions getting out of sync, add
a build bug to check for this.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:37:06 -07:00
Heiner Kallweit c6576bfe2f phy: warn if phylib and ethtool PHY mode definitions are out of sync
If new PHY modes are added people may miss to update all relevant places
in the kernel. Therefore add a build bug check for new modes in enum
ethtool_link_mode_bit_indices that haven't been added to phylib yet.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:37:06 -07:00
Heiner Kallweit 5a3144e419 net: phy: add support for new modes in phylib
Recently new modes have been added to ethtool.h, but the related
extension to phylib hasn't been done yet. So add support for these
modes.

v2:
- add missing 100Gbps and 50Gbps modes

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:37:06 -07:00
Heiner Kallweit f1538eca9e net: phy: shrink PHY settings array
The definition of array settings[] is quite lengthy meanwhile. Add a
macro to shrink the definition.

v2:
- Fix an indentation issue

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14 13:37:06 -07:00
David S. Miller 5fa7d3f9d3 Merge branch 'rhashtable-bit-locking-m68k'
NeilBrown says:

====================
Fix rhashtable bit-locking for m68k

As reported by Guenter Roeck, the new rhashtable bit-locking
doesn't work on m68k as it only requires 2-byte alignment, so BIT(1)
is addresses is not unused.

We current use BIT(0) to identify a NULLS marker, but that is only
needed in ->next pointers.  The bucket head does not need a NULLS
marker, so the lsb there can be used for locking.

the first 4 patches make some small improvements and re-arrange some
code.  The final patch converts to using only BIT(0) for these two
different special purposes.

I had previously suggested dropping the series until I fix it.  Given
that this was fairly easy, I retract that I think it best simply to
add these patches to fix the code.
====================

Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:34:46 -07:00
NeilBrown ca0b709d1a rhashtable: use BIT(0) for locking.
As reported by Guenter Roeck, the new bit-locking using
BIT(1) doesn't work on the m68k architecture.  m68k only requires
2-byte alignment for words and longwords, so there is only one
unused bit in pointers to structs - We current use two, one for the
NULLS marker at the end of the linked list, and one for the bit-lock
in the head of the list.

The two uses don't need to conflict as we never need the head of the
list to be a NULLS marker - the marker is only needed to check if an
object has moved to a different table, and the bucket head cannot
move.  The NULLS marker is only needed in a ->next pointer.

As we already have different types for the bucket head pointer (struct
rhash_lock_head) and the ->next pointers (struct rhash_head), it is
fairly easy to treat the lsb differently in each.

So: Initialize buckets heads to NULL, and use the lsb for locking.
When loading the pointer from the bucket head, if it is NULL (ignoring
the lock big), report as being the expected NULLS marker.
When storing a value into a bucket head, if it is a NULLS marker,
store NULL instead.

And convert all places that used bit 1 for locking, to use bit 0.

Fixes: 8f0db01800 ("rhashtable: use bit_spin_locks to protect hash bucket.")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:34:45 -07:00
NeilBrown f4712b46a5 rhashtable: replace rht_ptr_locked() with rht_assign_locked()
The only times rht_ptr_locked() is used, it is to store a new
value in a bucket-head.  This is the only time it makes sense
to use it too.  So replace it by a function which does the
whole task:  Sets the lock bit and assigns to a bucket head.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:34:45 -07:00
NeilBrown adc6a3ab19 rhashtable: move dereference inside rht_ptr()
Rather than dereferencing a pointer to a bucket and then passing the
result to rht_ptr(), we now pass in the pointer and do the dereference
in rht_ptr().

This requires that we pass in the tbl and hash as well to support RCU
checks, and means that the various rht_for_each functions can expect a
pointer that can be dereferenced without further care.

There are two places where we dereference a bucket pointer
where there is no testable protection - in each case we know
that we much have exclusive access without having taken a lock.
The previous code used rht_dereference() to pretend that holding
the mutex provided protects, but holding the mutex never provides
protection for accessing buckets.

So instead introduce rht_ptr_exclusive() that can be used when
there is known to be exclusive access without holding any locks.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:34:45 -07:00
NeilBrown c5783311a1 rhashtable: reorder some inline functions and macros.
This patch only moves some code around, it doesn't
change the code at all.
A subsequent patch will benefit from this as it needs
to add calls to functions which are now defined before the
call-site, but weren't before.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:34:45 -07:00
NeilBrown e4edbe3c1f rhashtable: fix some __rcu annotation errors
With these annotations, the rhashtable now gets no
warnings when compiled with "C=1" for sparse checking.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:34:45 -07:00
Gustavo A. R. Silva c252aa3e8e rhashtable: use struct_size() in kvzalloc()
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along with
memory for some number of elements for that array.  For example:

struct foo {
    int stuff;
    struct boo entry[];
};

size = sizeof(struct foo) + count * sizeof(struct boo);
instance = kvzalloc(size, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

instance = kvzalloc(struct_size(instance, entry, count), GFP_KERNEL);

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:31:33 -07:00
David S. Miller 9d60f0ea1c Merge branch 'nfp-update-to-control-structures'
Jakub Kicinski says:

====================
nfp: update to control structures

This series prepares NFP control structures for crypto offloads.
So far we mostly dealt with configuration requests under rtnl lock.
This will no longer be the case with crypto.  Additionally we will
try to reuse the BPF control message format, so we move common code
out of BPF.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:29:15 -07:00
Jakub Kicinski bcf0cafab4 nfp: split out common control message handling code
BPF's control message handler seems like a good base to built
on for request-reply control messages.  Split it out to allow
for reuse.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:29:15 -07:00
Jakub Kicinski 0a72d8332c nfp: move vNIC reset before netdev init
During probe we clear vNIC configuration in case the device
wasn't closed cleanly by previous driver.  Move that code
before netdev init, so netdev init can already try to apply
its config parameters.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:29:15 -07:00
Jakub Kicinski dd5b2498d8 nfp: add a mutex lock for the vNIC ctrl BAR
Soon we will try to write to the vNIC mailbox without RTNL held.
Add a new mutex to protect access to specific parts of the PCI
control BAR.

Move the mailbox size checking to the mailbox lock() helper, where
it can be more effective (happen prior to potential overwrite of
other data).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:29:15 -07:00
Dirk van der Merwe e64718282c nfp: opportunistically poll for reconfig result
If the reconfig was a quick update, we could have results available from
firmware within 200us.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:29:15 -07:00
David Ahern 1deeb6408c ipv6: Remove flowi6_oif compare from __ip6_route_redirect
In the review of 0b34eb0043 ("ipv6: Refactor __ip6_route_redirect"),
Martin noted that the flowi6_oif compare is moved to the new helper and
should be removed from __ip6_route_redirect. Fix the oversight.

Fixes: 0b34eb0043 ("ipv6: Refactor __ip6_route_redirect")
Reported-by: Martin Lau <kafai@fb.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 17:05:38 -07:00
David S. Miller 8c5a3ca306 Merge branch 'netdevsim-Mostly-cleanup-in-sdev-bpf-iface-area'
Jiri Pirko says:

====================
netdevsim: Mostly cleanup in sdev/bpf iface area

This patches does mainly internal netdevsim code shuffle. Nothing
serious, just small changes to help readability and preparations for
future work.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 16:49:54 -07:00
Jiri Pirko 4b3a84bce4 netdevsim: move sdev-specific init/uninit code into separate functions
In order to improve readability and prepare for future code changes,
move sdev specific init/uninit code into separate functions.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 16:49:54 -07:00
Jiri Pirko b26b6946a6 netdevsim: make bpf_offload_dev_create() per-sdev instead of first ns
offload dev is stored in sdev struct. However, first netdevsim instance
is used as a priv. Change this to be sdev to as it is shared among
multiple netdevsim instances.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 16:49:54 -07:00
Jiri Pirko 38f58c9723 netdevsim: move sdev specific bpf debugfs files to sdev dir
Some netdevsim bpf debugfs files are per-sdev, yet they are defined per
netdevsim instance. Move them under sdev directory.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 16:49:54 -07:00
Jiri Pirko af9095f00d netdevsim: move shared dev creation and destruction into separate file
To make code easier to read, move shared dev bits into a separate file.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 16:49:54 -07:00
Ioana Ciornei 3c91d11483 Documentation: net: dsa: transition to the rst format
This patch also performs some minor adjustments such as numbering for
the receive path sequence, conversion of keywords to inline literals and
adding an index page so it looks better in the output of 'make htmldocs'.

Signed-off-by: Ioana Ciornei <ciorneiioana@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 16:48:35 -07:00
Julian Wiedmann 056b21fbe6 net: veth: use generic helper to report timestamping info
For reporting the common set of SW timestamping capabilities, use
ethtool_op_get_ts_info() instead of re-implementing it.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 16:26:37 -07:00
Julian Wiedmann af730342ec net: loopback: use generic helper to report timestamping info
For reporting the common set of SW timestamping capabilities, use
ethtool_op_get_ts_info() instead of re-implementing it.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 16:26:37 -07:00
Julian Wiedmann abe9fd5726 net: dummy: use generic helper to report timestamping info
For reporting the common set of SW timestamping capabilities, use
ethtool_op_get_ts_info() instead of re-implementing it.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 16:26:37 -07:00
David S. Miller e0a092ebeb Merge branch 'smc-next'
Ursula Braun says:

====================
net/smc: patches 2019-04-12

here are patches for SMC:
* patch 1 improves behavior of non-blocking connect
* patches 2, 3, 5, 7, and 8 improve connecting return codes
* patches 4 and 6 are a cleanups without functional change
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 10:50:56 -07:00
Karsten Graul 7a62725a50 net/smc: improve smc_conn_create reason codes
Rework smc_conn_create() to always return a valid DECLINE reason code.
This removes the need to translate the return codes on 4 different
places and allows to easily add more detailed return codes by changing
smc_conn_create() only.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 10:50:56 -07:00
Karsten Graul 9aa68d298c net/smc: improve smc_listen_work reason codes
Rework smc_listen_work() to provide improved reason codes when an
SMC connection is declined. This allows better debugging on user side.
This also adds 3 more detailed reason codes in smc_clc.h to indicate
what type of device was not found (ism or rdma or both), or if ism
cannot talk to the peer.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 10:50:56 -07:00
Karsten Graul 228bae05be net/smc: code cleanup smc_listen_work
In smc_listen_work() the variables rc and reason_code are defined which
have the same meaning. Eliminate reason_code in favor of the shorter
name rc. No functional changes.
Rename the functions smc_check_ism() and smc_check_rdma() into
smc_find_ism_device() and smc_find_rdma_device() to make there purpose
more clear. No functional changes.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 10:50:56 -07:00
Karsten Graul fba7e8ef51 net/smc: cleanup of get vlan id
The vlan_id of the underlying CLC socket was retrieved two times
during processing of the listen handshaking. Change this to get the
vlan id one time in connect and in listen processing, and reuse the id.
And add a new CLC DECLINE return code for the case when the retrieval
of the vlan id failed.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 10:50:56 -07:00
Karsten Graul bc36d2fc93 net/smc: consolidate function parameters
During initialization of an SMC socket a lot of function parameters need
to get passed down the function call path. Consolidate the parameters
in a helper struct so there are less enough parameters to get all passed
by register.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 10:50:56 -07:00
Karsten Graul 598866974c net/smc: check for ip prefix and subnet
The check for a matching ip prefix and subnet was only done for SMC-R
in smc_listen_rdma_check() but not when an SMC-D connection was
possible. Rename the function into smc_listen_prfx_check() and move its
call to a place where it is called for both SMC variants.
And add a new CLC DECLINE reason for the case when the IP prefix or
subnet check fails so the reason for the failing SMC connection can be
found out more easily.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 10:50:56 -07:00
Karsten Graul 4ada81fddf net/smc: fallback to TCP after connect problems
Correct the CLC decline reason codes for internal problems to not have
the sign bit set, negative reason codes are interpreted as not eligible
for TCP fallback.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 10:50:56 -07:00
Ursula Braun 50717a37db net/smc: nonblocking connect rework
For nonblocking sockets move the kernel_connect() from the connect
worker into the initial smc_connect part to return kernel_connect()
errors other than -EINPROGRESS to user space.

Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 10:50:56 -07:00
Dongli Zhang 6dc400af21 xen-netback: add reference from xenvif to backend_info to facilitate coredump analysis
During coredump analysis, it is not easy to obtain the address of
backend_info in xen-netback.

So far there are two ways to obtain backend_info:

1. Do what xenbus_device_find() does for vmcore to find the xenbus_device
and then derive it from dev_get_drvdata().

2. Extract backend_info from callstack of xenwatch (e.g., netback_remove()
or frontend_changed()).

This patch adds a reference from xenvif to backend_info so that it would be
much more easier to obtain backend_info during coredump analysis.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-12 10:10:28 -07:00
David S. Miller 8af9f7291e Merge branch 'sctp-skb-list'
David Miller says:

====================
SCTP: Event skb list overhaul.

This patch series eliminates the explicit reference to the skb list
implementation via skb->prev dereferences.

The approach used is to pass a non-empty skb list around instead of an
event skb object which may or may not be on a list.

I'd like to thank Marcelo Leitner, Xin Long, and Neil Horman for
reviewing previous versions of this series.

Testing would be very much appreciated, in addition to the review of
course.

v4 --> v5: Rebase to net-next

v3 --> v4: Fix the logic in patch #4 so that we don't miss cases
           where we should add event to the on-stack temp list.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-11 21:33:37 -07:00
David Miller 013b96ec64 sctp: Pass sk_buff_head explicitly to sctp_ulpq_tail_event().
Now the SKB list implementation assumption can be removed.

And now that we know that the list head is always non-NULL
we can remove the code blocks dealing with that as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-11 21:33:31 -07:00
David Miller 178ca044aa sctp: Make sctp_enqueue_event tak an skb list.
Pass this, instead of an event.  Then everything trickles down and we
always have events a non-empty list.

Then we needs a list creating stub to place into .enqueue_event for sctp_stream_interleave_1.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-11 21:33:31 -07:00