anon_inode_getfd() should be used *ONLY* in situations when we are
guaranteed to be past the last failure point (including copying the
descriptor number to userland, at that). And ksys_close() should
not be used for cleanups at all.
anon_inode_getfile() is there for all nontrivial cases like that.
Just use that...
Fixes: b3e5838252 ("clone: add CLONE_PIDFD")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Jann Horn <jannh@google.com>
Signed-off-by: Christian Brauner <christian@brauner.io>
When an application is run that:
a) Sets its scheduler to be SCHED_FIFO
and
b) Opens a memory mapped AF_PACKET socket, and sends frames with the
MSG_DONTWAIT flag cleared, it's possible for the application to hang
forever in the kernel. This occurs because when waiting, the code in
tpacket_snd calls schedule, which under normal circumstances allows
other tasks to run, including ksoftirqd, which in some cases is
responsible for freeing the transmitted skb (which in AF_PACKET calls a
destructor that flips the status bit of the transmitted frame back to
available, allowing the transmitting task to complete).
However, when the calling application is SCHED_FIFO, its priority is
such that the schedule call immediately places the task back on the cpu,
preventing ksoftirqd from freeing the skb, which in turn prevents the
transmitting task from detecting that the transmission is complete.
We can fix this by converting the schedule call to a completion
mechanism. By using a completion queue, we force the calling task, when
it detects there are no more frames to send, to schedule itself off the
cpu until such time as the last transmitted skb is freed, allowing
forward progress to be made.
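As a rough user-space illustration of the completion idea (not the kernel
code), the sketch below uses Python's threading.Event in place of a kernel
completion. The class and method names (TxRing, destruct_skb, send) are
hypothetical; only the pattern follows the description above: re-arm the
completion on every send, signal it from the destructor once nothing is
pending, and sleep on it with a timeout instead of spinning through
schedule().

```python
import threading


class TxRing:
    """Toy model of a TX ring; every name here is illustrative only."""

    def __init__(self):
        self.lock = threading.Lock()
        self.pending = 0
        # Stands in for the kernel completion object.
        self.done = threading.Event()

    def destruct_skb(self):
        """Runs when a transmitted frame is freed (from another thread)."""
        with self.lock:
            self.pending -= 1
            # Signal only when the last pending frame is gone, so the
            # waiter is not woken once per frame.
            if self.pending == 0:
                self.done.set()

    def send(self, nframes, timeout):
        """Queue nframes, then sleep until all are freed or timeout hits."""
        # Re-arm on every call so a stale signal from an earlier
        # non-blocking send cannot end this wait early.
        self.done.clear()
        with self.lock:
            self.pending += nframes
        for _ in range(nframes):
            # Pretend the driver frees each frame a little later.
            threading.Timer(0.01, self.destruct_skb).start()
        # Returns True if signalled, False if the timeout expired.
        return self.done.wait(timeout)
```

The waiter gives up the cpu inside wait(), so the thread that frees the
frames can run regardless of the waiter's priority.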
Tested by myself and the reporter, with good results.
Change Notes:
V1->V2:
Enhance the sleep logic to support being interruptible and
to honor SK_SNDTIMEO (Willem de Bruijn)
V2->V3:
Rearrange the point at which we wait for the completion queue, to
avoid needing to check for ph/skb being null at the end of the loop.
Also move the complete call to the skb destructor to avoid needing to
modify __packet_set_status. Also gate calling complete on
packet_read_pending returning zero to avoid multiple calls to complete.
(Willem de Bruijn)
Move timeo computation within loop, to re-fetch the socket
timeout since we also use the timeo variable to record the return code
from the wait_for_completion call (Neil Horman)
V3->V4:
Willem has requested that the control flow be restored to the
previous state. Doing so lets us eliminate the need for the
po->wait_on_complete flag variable, and lets us get rid of the
packet_next_frame function, but introduces another complexity.
Specifically, because we use the packet pending count, if an
application calls sendmsg multiple times with MSG_DONTWAIT set, each
set of transmitted frames, when complete, will cause
tpacket_destruct_skb to issue a complete call, for which there will
never be a wait_for_completion call. This imbalance will cause any
future call to wait_for_completion here to return early, when the frames
it sent may not have completed. To correct this, we need to re-init
the completion queue on every call to tpacket_snd before we enter the
loop so as to ensure we wait properly for the frames we send in this
iteration.
Change the timeout and interrupted gotos to out_put rather than
out_status so that we don't try to free a nonexistent skb.
Clean up some extra newlines (Willem de Bruijn)
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently sctp_endpoint_init() holds the sk and then creates the auth
shkey. But when the creation fails, it doesn't release the sk,
which causes a sk refcnt leak.
Fix it by only holding the sk when the auth shkey is created
successfully.
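The ordering problem can be modelled in a few lines. This is only a
sketch with hypothetical names (Sock, endpoint_init_leaky,
endpoint_init_fixed), not the SCTP code: taking the reference after the
step that can fail means the error path has nothing to undo.

```python
class Sock:
    """Minimal stand-in for a refcounted socket."""

    def __init__(self):
        self.refcnt = 1

    def hold(self):
        self.refcnt += 1


def endpoint_init_leaky(sk, create_shkey):
    # Reference taken first; the early return below never drops it.
    sk.hold()
    if create_shkey() is None:
        return None
    return {"sk": sk}


def endpoint_init_fixed(sk, create_shkey):
    # Do the fallible step first, and hold the socket only on success.
    shkey = create_shkey()
    if shkey is None:
        return None
    sk.hold()
    return {"sk": sk, "shkey": shkey}
```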
Fixes: a29a5bd4f5 ("[SCTP]: Implement SCTP-AUTH initializations.")
Reported-by: syzbot+afabda3890cc2f765041@syzkaller.appspotmail.com
Reported-by: syzbot+276ca1c77a19977c0130@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Palmer Dabbelt says:
====================
net: macb: Fix compilation on systems without COMMON_CLK, v2
Our patch to add support for the FU540-C000 broke compilation on at
least powerpc allyesconfig, which was found as part of the linux-next
build regression tests. This must have somehow slipped through the
cracks, as the patch has been reverted in linux-next for a while now.
This patch applies on top of the offending commit, which is the only one
I've even tried it on as I'm not sure how this subsystem makes it to
Linus.
This patch set fixes the issue by adding a dependency of COMMON_CLK to
the MACB Kconfig entry, which avoids the build failure by disabling MACB
on systems where it wouldn't compile. All known users of MACB have
COMMON_CLK, so this shouldn't cause any issues. This is a significantly
simpler approach than disabling just the FU540-C000 support.
I've also included a second patch to indicate this is a driver for a
Cadence device that was originally written by an engineer at Atmel. The
only relation is that I stumbled across it when writing the first patch.
Changes since v1 <20190624061603.1704-1-palmer@sifive.com>:
* Disable MACB on systems without COMMON_CLK, instead of just disabling
the FU540-C000 support on these systems.
* Update the commit message to reflect the driver was written by Atmel.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The help text makes it look like NET_VENDOR_CADENCE enables support for
Atmel devices, when in reality it's a driver written by Atmel that
supports Cadence devices. This may confuse users that have this device
on a non-Atmel SoC.
The fix is just s/Atmel/Cadence/, but I did go and re-wrap the Kconfig
help text as that change caused it to go over 80 characters.
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit c218ad5590 ("macb: Add support for SiFive FU540-C000") added a
dependency on the common clock framework to the macb driver, but didn't
express that dependency in Kconfig. As a result macb now fails to
compile on systems without COMMON_CLK, which specifically causes a build
failure on powerpc allyesconfig.
This patch adds the dependency, which results in the macb driver no
longer being selectable on systems without the common clock framework.
All known systems that have this device already support the common clock
framework, so this should not cause trouble for any uses. Supporting
both the FU540-C000 and systems without COMMON_CLK is quite ugly.
I've build tested this on powerpc allyesconfig and RISC-V defconfig
(which selects MACB), but I have not even booted the resulting kernels.
Fixes: c218ad5590 ("macb: Add support for SiFive FU540-C000")
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel says:
====================
ipv6: fix neighbour resolution with raw socket
The first patch prepares the fix: it constifies rt6_nexthop().
The detail of the bug is explained in the second patch.
v1 -> v2:
- fix compilation warnings
- split the initial patch
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The scenario is the following: the user uses a raw socket to send an ipv6
packet destined to a non-connected network, and specifies a connected nh.
Here is the corresponding python script to reproduce this scenario:
import socket
IPPROTO_RAW = 255
send_s = socket.socket(socket.AF_INET6, socket.SOCK_RAW, IPPROTO_RAW)
# scapy
# p = IPv6(src='fd00:100::1', dst='fd00:200::fa')/ICMPv6EchoRequest()
# str(p)
req = b'`\x00\x00\x00\x00\x08:@\xfd\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xfd\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xfa\x80\x00\x81\xc0\x00\x00\x00\x00'
send_s.sendto(req, ('fd00:175::2', 0, 0, 0))
fd00:175::/64 is a connected route and fd00:200::fa is not a connected
host.
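To double-check which addresses the raw packet carries, the fixed IPv6
header can be decoded in user space. The sketch below is an illustration
only: parse_ipv6_header is a hypothetical helper, and the byte string is
written out field by field on the assumption that it is the same 48 bytes
as req in the script (a 40-byte IPv6 header followed by 8 bytes of
ICMPv6).

```python
import ipaddress
import struct

# Assumed to match req in the reproducer, split by field for readability.
req = (
    b'`\x00\x00\x00\x00\x08:@'                        # version/flow, len, nh, hlim
    + b'\xfd\x00\x01\x00' + b'\x00' * 11 + b'\x01'    # source address
    + b'\xfd\x00\x02\x00' + b'\x00' * 11 + b'\xfa'    # destination address
    + b'\x80\x00\x81\xc0\x00\x00\x00\x00'             # ICMPv6 echo request
)


def parse_ipv6_header(packet):
    """Decode the fixed 40-byte IPv6 header at the start of packet."""
    first_word, payload_len, next_header, hop_limit = struct.unpack(
        '!IHBB', packet[:8])
    return {
        'version': first_word >> 28,
        'payload_len': payload_len,
        'next_header': next_header,
        'hop_limit': hop_limit,
        'src': str(ipaddress.IPv6Address(packet[8:24])),
        'dst': str(ipaddress.IPv6Address(packet[24:40])),
    }


hdr = parse_ipv6_header(req)
```

The destination in the header (hdr['dst']) differs from the address
passed to sendto(), which is what makes the nexthop choice matter here.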
With this scenario, the kernel starts by sending a NS to resolve
fd00:175::2. When it receives the NA, it flushes its queue and tries to send
the initial packet. But instead of sending it, it sends another NS to
resolve fd00:200::fa, which obviously fails, thus the packet is dropped. If
the user sends the packet again, it now uses the right nh (fd00:175::2).
The problem is that ip6_dst_lookup_neigh() uses the rt6i_gateway, which is
:: because the associated route is a connected route, thus it uses the dst
addr of the packet. Let's use rt6_nexthop() to choose the right nh.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no functional change in this patch, it only prepares the next one.
rt6_nexthop() will be used by ip6_dst_lookup_neigh(), which uses const
variables.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reported-by: kbuild test robot <lkp@intel.com>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace gpiod_set_value() with gpiod_set_value_cansleep(), as the switch
reset GPIO can be connected to e.g. I2C GPIO expander and it is perfectly
fine for the kernel to sleep for a bit in ksz_switch_register().
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The longstanding prohibition against using 0.0.0.0/8 dates back
to two issues with the early internet.
There was an interoperability problem with BSD 4.2 in 1984, fixed in
BSD 4.3 in 1986. BSD 4.2 has long since been retired.
Secondly, addresses of the form 0.x.y.z were initially defined only as
a source address in an ICMP datagram, indicating "node number x.y.z on
this IPv4 network", by nodes that know their address on their local
network, but do not yet know their network prefix, in RFC0792 (page
19). This usage of 0.x.y.z was later repealed in RFC1122 (section
3.2.2.7), because the original ICMP-based mechanism for learning the
network prefix was unworkable on many networks such as Ethernet (which
have longer addresses that would not fit into the 24 "node number"
bits). Modern networks use reverse ARP (RFC0903) or BOOTP (RFC0951)
or DHCP (RFC2131) to find their full 32-bit address and CIDR netmask
(and other parameters such as default gateways). The 16,777,215
addresses of the form 0.x.y.z in 0.0.0.0/8 space have been left unused
and reserved for future use since 1989.
This patch allows for these 16m new IPv4 addresses to appear within
a box or on the wire. Layer 2 switches don't care.
0.0.0.0/32 is still prohibited, of course.
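The change in behaviour can be modelled as two predicates. This is a
sketch of the rule as described above, not the kernel's actual check;
is_zeronet_old and is_zeronet_new are hypothetical names.

```python
import ipaddress


def is_zeronet_old(addr):
    """Old rule: anything in 0.0.0.0/8 is rejected."""
    return (int(ipaddress.IPv4Address(addr)) & 0xFF000000) == 0


def is_zeronet_new(addr):
    """New rule: only 0.0.0.0 itself is rejected."""
    return int(ipaddress.IPv4Address(addr)) == 0


# Addresses in 0.0.0.0/8 other than 0.0.0.0 itself.
newly_usable = 2 ** 24 - 1
```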
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: John Gilmore <gnu@toad.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In configuration of vlan over bridge over aquantia device
it was found that vlan tagged traffic is dropped on chip.
The reason is that bridge device enables promisc mode,
but in atlantic chip vlan filters will still apply.
So we have to correlate promisc settings with vlan configuration.
The solution is to track in a separate state variable the
need of vlan forced promisc. And also consider generic
promisc configuration when doing vlan filter config.
Fixes: 7975d2aff5 ("net: aquantia: add support of rx-vlan-filter offload")
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dst_default_metrics has all of the metrics initialized to 0, so nothing
will be added to the skb in rtnetlink_put_metrics. Avoid the loop if
metrics is from dst_default_metrics.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Puranjay Mohan says:
====================
net: fddi: skfp: Use PCI generic definitions instead of private duplicates
This patch series removes the private duplicates of PCI definitions in
favour of generic definitions defined in pci_regs.h.
This driver only uses some of the generic PCI definitions,
which are included from pci_regs.h and their private versions
are removed from skfbi.h with all other private defines.
The skfbi.h defines PCI_REV_ID and other private defines with different
names, these are renamed to Generic PCI names to make them
compatible with defines in pci_regs.h.
All unused defines are removed from skfbi.h.
Changes in v5:
Removed unused PCI definitions which were left in v4
Changes in v4:
Removed unused PCI definitions which were left in v3
Changes in v3:
Renamed all local PCI definitions to Generic names.
Corrected coding style mistakes.
Changes in v2:
Converted individual patches to a series.
Made sure that individual patches build correctly
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove unused private PCI definitions from skfbi.h because generic PCI
symbols are already included from pci_regs.h.
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include the uapi/linux/pci_regs.h header file which contains the generic
PCI defines.
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename the PCI_REV_ID and other local defines to Generic PCI define names
in skfbi.h and drvfbi.c to make it compatible with the pci_regs.h.
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Multicast or broadcast egress packets have rt_iif set to the oif. These
packets might be recirculated back as input and lookup to the raw
sockets may fail because they are bound to the incoming interface
(skb_iif). If rt_iif is not zero, during the lookup, inet_iif() function
returns rt_iif instead of skb_iif. Hence, the lookup fails.
v2: Make it non vrf specific (David Ahern). Reword the changelog to
reflect it.
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the ingress ACL rules save vhca id and vport number to packet's
metadata REG_C_0, and the metadata matching for the rules in both fast
path and slow path are all added, enable this feature if supported.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
If vport metadata matching is enabled in eswitch, the rule created
must be changed to match on the metadata, instead of source port.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In slow path, a packet that is not matched by any offloaded rule is
forwarded to eswitch vport manager for further processing.
Add matching on metadata for peer miss rules in FDB, and rules which
forward packet to correct representor in esw manager NIC_RX table.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In order to do matching on metadata in slow path when demuxing traffic
to representors, explicitly enable the feature that allows HW to pass
metadata REG_C_0 from FDB to eswitch manager NIC_RX table.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add and expose esw vport query and modify functions, which are needed for
enabling or disabling registers passed as metadata to vport NIC_RX table
in slow path.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
If FW's capabilities and configurations meet the requirement of vport
metadata matching, this feature will be used. As the information
about vport number and vhca_id related to packet is already stored to
its metadata register, which is used as an indicator for a particular
vport, now we can change to match on this metadata for all the
offloading rules in fast path.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In vport metadata matching, source port number is replaced by metadata.
Because FW has no idea what is in the metadata, a syndrome will
occur. Specify a known origin to avoid the syndrome.
However, there is no functional change because ANY_VPORT (0) is filled
in flow_source, the same default value as before, as a pre-step towards
metadata matching for fast path.
There are two other values that can be filled in flow_source. When setting
0x1, packet matching this rule is from uplink, while 0x2 is for packet
from other local vports.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When a dual-port VHCA sends a RoCE packet on its non-native port, and the
packet arrives to its affiliated vport FDB, a mismatch might occur on the
rules that match the packet source vport as it is not represented by single
VHCA only in this case. So we change to match on metadata instead of source
vport.
To do that, a rule is created in all vports and uplink ingress ACLs, to
save the source vport number and vhca id in the packet's metadata in order
to match on it later.
The metadata register used is the first of the 32-bit type C registers. It
can be used for matching and header modify operations. The higher 16 bits
of this register are for vhca id, and the lower 16 bits are for the vport
number.
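The register layout described above can be sketched as a pair of helpers.
The function names (pack_metadata, unpack_metadata) are hypothetical and
this is only an illustration of the bit layout, not driver code.

```python
VHCA_ID_SHIFT = 16      # vhca id lives in the higher 16 bits
VPORT_MASK = 0xFFFF     # vport number lives in the lower 16 bits


def pack_metadata(vhca_id, vport_num):
    """Build the 32-bit metadata value from vhca id and vport number."""
    if not 0 <= vhca_id <= 0xFFFF or not 0 <= vport_num <= 0xFFFF:
        raise ValueError("both fields must fit in 16 bits")
    return (vhca_id << VHCA_ID_SHIFT) | vport_num


def unpack_metadata(reg):
    """Split a 32-bit metadata value back into (vhca_id, vport_num)."""
    return reg >> VHCA_ID_SHIFT, reg & VPORT_MASK
```

For example, vhca id 0x12 and vport 3 give 0x00120003, and a rule can
match on that single value instead of on the source port.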
This change is not for dual-port RoCE only. If HW and FW allow, the vport
metadata matching is enabled by default.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Refactor the flow data structures, add new flow_context and move
flow_tag into it, as flow_tag doesn't belong to the rule action.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Introduce a helper API mlx5_eswitch_is_vf_vport() to check
if a given vport_num belongs to VF or not.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
That modify header action can be then attached to a steering rule in
the ingress ACL.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The ingress and egress ACL root namespaces are created per vport and
stored into arrays. However, the vport number is not the same as the
index. Pass the array index, instead of the vport number, to get the
correct ingress and egress acl namespace.
Fixes: 9b93ab981e ("net/mlx5: Separate ingress/egress namespaces for each vport")
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When a dual-port VHCA sends a RoCE packet on its non-native port, and
the packet arrives to its affiliated vport FDB, a mismatch might occur
on the rules that match the packet source vport. So we replace the
match on source port with the match on metadata that was configured in
ingress ACL, and that metadata will be passed further also to the NIC
RX table of the eswitch manager.
Introduce vport metadata matching bits and enum constants as a pre-step
towards metadata matching.
o metadata type C registers in the misc parameters 2 fields.
o esw_uplink_ingress_acl bit in esw cap. If it is set, the device supports
ingress ACL for the uplink vport.
o fdb_to_vport_reg_* bits in flow table cap and esw vport context, to
support propagating the metadata to the nic rx through the loopback
path.
o flow_source in flow context, to indicate the known origin of packets.
o enum constants, to support the above bits.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
We should rather have vlan_tci filled all the way down
to the transmitting netdevice and let it do the hw/sw
vlan implementation.
Suggested-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'wireless-drivers-next-for-davem-2019-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for 5.3
First set of patches for 5.3, but not that many patches this time.
This pull request fails to compile with the tip tree due to
ktime_get_boot_ns() API changes there. It should be easy for Linus to
fix it in p54 driver once he pulls this, an example resolution here:
https://lkml.kernel.org/r/20190625160432.533aa140@canb.auug.org.au
Major changes:
airo
* switch to use skcipher interface
p54
* support boottime in scan results
rtw88
* add fast xmit support
* add random mac address on scan support
rt2x00
* add software watchdog to detect hangs, it's disabled by default
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun says:
====================
net/smc: fixes 2019-06-26
Here are 2 small smc fixes for the net tree.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If register_pernet_subsys() succeeds in smc_init(),
we should clean it up in case of any later error.
Fixes: 64e28b52c7 ("net/smc: add pnet table namespace support")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After smc_lgr_create(), the newly created link group is added
to smc_lgr_list, thus is accessible from other context.
Although link group creation is serialized by
smc_create_lgr_pending, the new link group may still be accessed
concurrently. For example, if ib_device is no longer active,
smc_ib_port_event_work() will call smc_port_terminate(), which
in turn will call __smc_lgr_terminate() on every link group of
this device. So conns_lock is required here.
Signed-off-by: Huaping Zhou <zhp@smail.nju.edu.cn>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct virtchnl_iwarp_qvlist_info {
...
struct virtchnl_iwarp_qv_info qv_info[1];
};
size = sizeof(struct virtchnl_iwarp_qvlist_info) + (sizeof(struct virtchnl_iwarp_qv_info) * count);
instance = kzalloc(size, GFP_KERNEL);
and
struct virtchnl_vf_resource {
...
struct virtchnl_vsi_resource vsi_res[1];
};
size = sizeof(struct virtchnl_vf_resource) + sizeof(struct virtchnl_vsi_resource) * count;
instance = kzalloc(size, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
instance = kzalloc(struct_size(instance, qv_info, count), GFP_KERNEL);
and
instance = kzalloc(struct_size(instance, vsi_res, count), GFP_KERNEL);
Notice that, in the first case above, variable size is not necessary, hence it
is removed.
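One reason the helper is safer than the open-coded form is overflow
handling: struct_size() saturates to SIZE_MAX instead of wrapping, so an
oversized request fails in the allocator rather than producing a short
buffer. The Python model below only illustrates that arithmetic; it
assumes a 64-bit size_t and takes plain sizes as arguments, unlike the
real macro, which takes a pointer, a member name and a count.

```python
SIZE_MAX = 2 ** 64 - 1  # assumption: 64-bit size_t


def open_coded_size(header_size, elem_size, count):
    """Model of sizeof(hdr) + sizeof(elem) * count with silent wraparound."""
    return (header_size + elem_size * count) & SIZE_MAX


def struct_size_model(header_size, elem_size, count):
    """Model of the helper: the same sum, saturating at SIZE_MAX."""
    total = header_size + elem_size * count
    return SIZE_MAX if total > SIZE_MAX else total
```

With a huge count the open-coded sum can wrap to a small number, while
the saturating version yields a size no allocator will satisfy.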
This code was detected with the help of Coccinelle.
Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It was found that the string that prints our copyright was
not up to date. Updating to reflect our copyright.
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Changing descriptor count via 'ethtool -G' is not persistent across resets.
When PF reset occurs, we roll back to the default value of vsi->num_desc,
which is used then in i40e_alloc_rings to set descriptor count. XDP does a
PF reset, so when the user has changed the descriptor count and loads an
XDP program, the count reverts to the default.
To fix this:
* introduce new VSI members - num_tx_desc and num_rx_desc in favour of
num_desc
* set them in i40e_set_ringparam to user's values
* set them to default values in i40e_set_num_rings_in_vsi only when they
don't have previous values
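The three steps above can be sketched as follows. This is a simplified
model rather than the driver code: the Vsi class is a stand-in, the
function bodies are reduced to the descriptor-count logic, and
DEFAULT_DESC is an assumed value used only for illustration.

```python
DEFAULT_DESC = 512  # assumed default, for illustration only


class Vsi:
    """Stand-in for the VSI; 0 means 'no value set yet'."""

    def __init__(self):
        self.num_tx_desc = 0
        self.num_rx_desc = 0


def set_ringparam(vsi, tx_count, rx_count):
    """Models 'ethtool -G': remember the user's values in the VSI."""
    vsi.num_tx_desc = tx_count
    vsi.num_rx_desc = rx_count


def set_num_rings_in_vsi(vsi):
    """Runs on init and on every reset; fills in defaults only if unset."""
    if not vsi.num_tx_desc:
        vsi.num_tx_desc = DEFAULT_DESC
    if not vsi.num_rx_desc:
        vsi.num_rx_desc = DEFAULT_DESC
```

Calling set_num_rings_in_vsi() again after set_ringparam() leaves the
user's counts in place, which is the persistence the fix is after.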
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes reading f/w LLDP agent status at DCB init time.
It's done by removing direct NVM reading in i40e_update_dcb_config()
and checking whether f/w LLDP agent is disabled via
I40E_FLAG_DISABLE_FW_LLDP flag in i40e_init_pf_dcb(). The function
i40e_update_dcb_config() in i40e_main.c is a temporary solution which
will be later renamed to i40e_init_dcb() in the i40e_dcb module. Also
logging was extended to make visible if f/w LLDP agent is running or not
and always log a message when DCB was not initialized. Without this
patch for new f/w versions f/w LLDP agent status was always read
from NVM as disabled and DCB initialization failed without
clear reason in logs.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Generate log entry when TC0 is created or deleted.
Log entry is generated during main VSI setup.
Before there was no log info about adding or deleting TC0.
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix for missing "Supported link modes" and "Advertised link modes"
info in ethtool after changing speed on X722 devices with BASE-T PHY
with FW API version >= 1.7.
The same FW API version on X710 and X722 does not mean the same
feature set so the change was needed as mac type of the device
should also be checked instead of FW API version only.
Signed-off-by: Martyna Szapar <martyna.szapar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes 'NIC Link is Up, Unknown bps' message in dmesg
for 2.5Gb/5Gb speeds. This problem is fixed by adding constants
for VIRTCHNL_LINK_SPEED_2_5GB and VIRTCHNL_LINK_SPEED_5GB cases
in the i40e_virtchnl_link_speed() function.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The next call to ixgbevf_update_itr will continue to dynamically
update ITR.
Copy from commit bdbeefe8ea ("ixgbe: fix possible divide by zero in
ixgbe_update_itr")
Signed-off-by: Young Xiao <92siuyang@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Some transceivers may comply with SFF-8472 but not implement the Digital
Diagnostic Monitoring (DDM) interface described in it. The existence of
such area is specified by bit 6 of byte 92, set to 1 if implemented.
Currently, due to not checking this bit ixgbe fails trying to read SFP
module's eeprom with the following message:
ethtool -m enP51p1s0f0
Cannot get Module EEPROM data: Input/output error
This happens because it fails to read the additional 256 bytes in which
the DDM data was assumed to exist.
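The check described above amounts to testing one bit before deciding how
much of the module EEPROM to read. The sketch below is a user-space
illustration with hypothetical names; it assumes a 256-byte base area in
addition to the 256 bytes of DDM data mentioned above.

```python
DDM_BYTE = 92          # byte that advertises the DDM area
DDM_BIT = 6            # set to 1 when DDM is implemented
BASE_LEN = 256         # assumed size of the base EEPROM area
DDM_LEN = 256          # additional bytes holding the DDM data


def ddm_implemented(eeprom):
    """True if bit 6 of byte 92 is set in the base EEPROM contents."""
    return bool(eeprom[DDM_BYTE] & (1 << DDM_BIT))


def readable_length(eeprom):
    """How many bytes a dump should attempt to read for this module."""
    if ddm_implemented(eeprom):
        return BASE_LEN + DDM_LEN
    return BASE_LEN
```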
This issue was noticed using a Mellanox Passive DAC PN 01FT738. The eeprom
data was confirmed by Mellanox as correct and present in other Passive
DACs from other manufacturers.
Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If the PHY does not support EEE mode, then a crash is observed when the
ethernet interface is enabled. The crash occurs, because if the PHY does
not support EEE, then although the EEE timer is never configured, it is
still marked as enabled and so the stmmac ethernet driver is still
trying to update the timer by calling mod_timer(). This triggers a BUG()
in the mod_timer() because we are trying to update a timer when there is
no callback function set because timer_setup() was never called for this
timer.
The problem is caused because we return true from the function
stmmac_eee_init(), marking the EEE timer as enabled, even when we have
not configured the EEE timer. Fix this by ensuring that we return false
if the PHY does not support EEE and hence, 'eee_active' is not set.
Fixes: 74371272f9 ("net: stmmac: Convert to phylink and remove phylib logic")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When stmmac_eee_init() is called to disable EEE support, then the timer
for EEE support is stopped and we return from the function. Prior to
stopping the timer, a mutex was acquired but in this case it is never
released and so could cause a deadlock. Fix this by releasing the mutex
prior to returning from stmmac_eee_init() when stopping the EEE timer.
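The shape of the bug is an early return taken while a lock is held. The
following is a user-space model with hypothetical names (EeeState,
eee_disable_buggy, eee_disable_fixed), not the stmmac code; it only shows
why every exit path has to release the lock.

```python
import threading


class EeeState:
    """Stand-in for the driver state guarded by a mutex."""

    def __init__(self):
        self.lock = threading.Lock()
        self.timer_running = True


def eee_disable_buggy(state):
    state.lock.acquire()
    state.timer_running = False
    # Early return without release: the next acquire would block forever.
    return False


def eee_disable_fixed(state):
    # The context manager releases the lock on every exit path.
    with state.lock:
        state.timer_running = False
        return False
```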
Fixes: 74371272f9 ("net: stmmac: Convert to phylink and remove phylib logic")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>