Trivial fix to spelling mistake in dev_err error message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
avoid to compute the hash value if dev is not null, since
hash value is not used
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/broadcom/genet/bcmgenet.c: In function 'bcmgenet_power_down':
drivers/net/ethernet/broadcom/genet/bcmgenet.c:1136:6: warning:
variable 'ret' set but not used [-Wunused-but-set-variable]
bcmgenet_power_down should return 'ret' instead of 0.
Fixes: ca8cf34190 ("net: bcmgenet: propagate errors from bcmgenet_power_down")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
net: sched: prepare for more Qdisc offloads
This series refactors the "switchdev" Qdisc offloads a little. We have
a few Qdiscs which can be fully offloaded today to the forwarding plane
of switching devices.
First patch adds a helper for handing statistic dumps, the code seems
to be copy pasted between PRIO and RED. Second patch removes unnecessary
parameter from RED offload function. Third patch makes the MQ offload
use the dump helper which helps it behave much like PRIO and RED when
it comes to the TCQ_F_OFFLOADED flag. Patch 4 adds a graft helper,
similar to the dump helper.
Patch 5 is unrelated to offloads, qdisc_graft() code seemed ripe for a
small refactor - no functional changes there.
Last two patches move the qdisc_put() call outside of the sch_tree_lock
section for RED and PRIO. The child Qdiscs will get removed from the
hierarchy under the lock, but having the put (and potentially destroy)
called outside of the lock helps offload which may choose to sleep,
and it should generally lower the Qdisc change impact.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Move destroying of the old child qdiscs outside of the sch_tree_lock()
section. This should improve the software qdisc replace but is even
more important for offloads. Calling offloads under a spin lock is
best avoided, and child's destroy would be called under sch_tree_lock().
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move destroying of the old child qdisc outside of the sch_tree_lock()
section. This should improve the software qdisc replace but is even
more important for offloads. Firstly calling offloads under a spin
lock is best avoided. Secondly the destroy event of existing child
would have been sent to the offload device before the replace, causing
confusion.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code for grafting Qdiscs when there is a parent has two needless
indentation levels, and breaks the "keep the success path unindented"
guideline. Refactor.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Qdisc graft operation of offload-capable qdiscs performs a few
extra steps which are identical among all the qdiscs. Add
a helper to share this code.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PRIO and RED mark the qdisc with TCQ_F_OFFLOADED upon successful offload,
make MQ do the same. The consistency will help with consistent
graft callback behaviour.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Offload dump helper does not use opt parameter, remove it.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Qdisc dump operation of offload-capable qdiscs performs a few
extra steps which are identical among all the qdiscs. Add
a helper to share this code.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit says:
====================
net: phy: improve and simplify phylib state machine
This patch series is based on two axioms:
- During autoneg a PHY always reports the link being down
- Info in clause 22/45 registers doesn't allow to differentiate between
these two states:
1. Link is physically down
2. A link partner is connected and PHY is autonegotiating
In both cases "link up" and "aneg finished" bits aren't set.
One consequence is that having separate states PHY_NOLINK and PHY_AN
isn't needed.
By using these two axioms the state machine can be significantly
simplified.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Use phy_check_link_status in more places in the state machine.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the recent changes in the state machine state PHY_AN isn't used
any longer and can be removed.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In few places in the state machine the state is set to PHY_RUNNING or
PHY_NOLINK after doing a phy_read_status(). So factor this out to
phy_check_link_status().
First use it in phy_start_aneg(): By setting the state to PHY_RUNNING
or PHY_NOLINK directly we can remove the code to handle the case that
we're using interrupts and aneg was finished already.
Definition of phy_link_up and phy_link_down needs to be moved because
they are called in the new function.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If aneg isn't finished yet then the PHY reports the link as down.
There's no benefit in setting the state to PHY_AN because the next
state machine run would set the status to PHY_NOLINK anyway (except
in the meantime aneg has been finished and link is up). Therefore
we can set the state to PHY_RUNNING or PHY_NOLINK directly.
In addition change the do_carrier parameter in phy_link_down() to true.
If carrier was marked as up before (what should never be the case because
PHY was in state PHY_HALTED before) then we should mark it as down now.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If aneg is enabled and the PHY reports the link as up then definitely
aneg finished successfully. Therefore this check is useless and
can be removed.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2018-11-07
This series contains updates to almost all of the Intel wired LAN
drivers.
Lance Roy replaces a spin lock with lockdep_assert_held() for igbvf
driver in move toward trying to remove spin_is_locked().
Colin Ian King fixes a potential null pointer dereference by adding a
check in ixgbe. Also fixed the igc driver by properly assigning the
return error code of a function call, so that we can properly check it.
Shannon Nelson updates the ixgbe driver to not block IPsec offload when
in VEPA mode, in VEB mode, IPsec offload is still blocked because the
device drops packets into a black hole.
Jake adds support for software timestamping for packets sent over
ixgbevf. Also modifies i40e, iavf, igb, igc, and ixgbe to delay calling
skb_tx_timestamp() to the latest point possible, which is just prior to
notifying the hardware of the new Tx packet.
Todd adds the new WoL filter flag so that we properly report that we do
not support this new feature.
YueHaibing from Huawei fixes the igc driver by cleaning up variables
that are not "really" used.
Dan Carpenter cleans up igc whitespace issues.
Miroslav Lichvar fixes e1000e for potential underflow issue in the
timecounter, so modify the driver to use timecounter_cyc2time() to allow
non-monotonic SYSTIM readings.
Sasha provides additional igc cleanups based on community feedback.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley says:
====================
nfp: add and use tunnel netdev helpers
A recent patch introduced the function netif_is_vxlan() to verify the
tunnel type of a given netdev as vxlan.
Add a similar function to detect geneve netdevs and make use of this
function in the NFP driver. Also make use of the vxlan helper where
applicable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Offload of geneve decap rules is supported in NFP. Include geneve in the
check for supported types.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of the recently added VXLAN and geneve helper functions to
determine the type of the netdev from its rtnl_link_ops.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper function to determine if the type of a netdev is geneve based
on its rtnl_link_ops. This allows drivers that may wish to offload tunnels
to check the underlying type of the device.
A recent patch added a similar helper to vxlan.h
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Expose the MUM/SUC Firmware, UEFI Expansion ROM and MC Status partitions
of the NIC's NVRAM as MTDs if found on the NIC. The first two are needed
in order to properly update them when performing firmware updates; the MC
Status partition is used to determine whether a signed firmware image was
accepted or rejected by a Secure NIC.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław says:
====================
net/vlan: prepare for removal of VLAN_TAG_PRESENT
This is a preparatory patchset before removing the use of VLAN_TAG_PRESENT
bit in skb->vlan_tci as indication of VLAN offload. This set includes
only cleanups that allow abstracting of code testing VLAN tag presence
in drivers and networking code.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Set the backlog earlier in inet_dccp_listen() and inet_listen(),
then we can avoid the redundant setting.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
GSO tunneled packets are always segmented in software before they are
transmitted by a VLAN, even when the lower device can offload tunnel
encapsulation and VLAN together (i.e., some bits in NETIF_F_GSO_ENCAP_ALL
mask are set in the lower device 'vlan_features'). If we let VLANs have
the same tunnel offload capabilities as their lower device, throughput
can improve significantly when CPU is limited on the transmitter side.
- set NETIF_F_GSO_ENCAP_ALL bits in the VLAN 'hw_features', to ensure
that 'features' will have those bits zeroed only when the lower device
has no hardware support for tunnel encapsulation.
- for the same reason, copy GSO-related bits of 'hw_enc_features' from
lower device to VLAN, and ensure to update that value when the lower
device changes its features.
- set NETIF_F_HW_CSUM bit in the VLAN 'hw_enc_features' if 'real_dev'
is able to compute checksums at least for a kind of packets, like done
with commit 8403debeea ("vlan: Keep NETIF_F_HW_CSUM similar to other
software devices"). This avoids software segmentation due to mismatching
checksum capabilities between VLAN's 'features' and 'hw_enc_features'.
Reported-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tun XDP sendmsg code path, unconditionally computes the symmetric
hash of each packet for RFS's sake, even when we could skip it. e.g.
when the device has a single queue.
This change adds the check already in-place for the skb sendmsg path
to avoid unneeded hashing.
The above gives small, but measurable, performance gain for VM xmit
path when zerocopy is not enabled.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add byte queue limits support in the fsl_ucc_hdlc driver.
Signed-off-by: Mathias Thore <mathias.thore@infinera.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of listing every single PHYID, load the driver for every PHYID
with a Realtek OUI, independent of model number and revision.
This patch also improves two further aspects:
- constify realtek_tbl[]
- the mask should have been 0xffffffff instead of 0x001fffff so far,
by masking out some bits a PHY from another vendor could have been
matched
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
phy_trigger_machine() is used in phy.c only, so we can make it static.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for BCM7255 EPHY.
Signed-off-by: Justin Chen <justinpopo6@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni says:
====================
udp: implement GRO support
This series implements GRO support for UDP sockets, as the RX counterpart
of commit bec1f6f697 ("udp: generate gso with UDP_SEGMENT").
The core functionality is implemented by the second patch, introducing a new
sockopt to enable UDP_GRO, while patch 3 implements support for passing the
segment size to the user space via a new cmsg.
UDP GRO performs a socket lookup for each ingress packets and aggregate datagram
directed to UDP GRO enabled sockets with constant l4 tuple.
UDP GRO packets can land on non GRO-enabled sockets, e.g. due to iptables NAT
rules, and that could potentially confuse existing applications.
The solution adopted here is to de-segment the GRO packet before enqueuing
as needed. Since we must cope with packet reinsertion after de-segmentation,
the relevant code is factored-out in ipv4 and ipv6 specific helpers and exposed
to UDP usage.
While the current code can probably be improved, this safeguard ,implemented in
the patches 4-7, allows future enachements to enable UDP GSO offload on more
virtual devices eventually even on forwarded packets.
The last 4 for patches implement some performance and functional self-tests,
re-using the existing udpgso infrastructure. The problematic scenario described
above is explicitly tested.
This revision of the series try to address the feedback provided by Willem and
Subash on previous iteration.
====================
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extends the existing udp programs to allow checking for proper
GRO aggregation/GSO size, and run the tests via a shell script, using
a veth pair with XDP program attached to trigger the GRO code path.
rfc v3 -> v1:
- use ip route to attach the xdp helper to the veth
rfc v2 -> rfc v3:
- add missing test program options documentation
- fix sporatic test failures (receiver faster than sender)
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Run on top of veth pair, using a dummy XDP program to enable the GRO.
rfc v3 -> v1:
- use ip route to attach the xdp helper to the veth
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This trivial XDP program does nothing, but will be used by the
next patch to test the GRO path in a net namespace, leveraging
the veth XDP implementation.
It's added here, despite its 'net' usage, to avoid the duplication
of the llc-related makefile boilerplate.
rfc v3 -> v1:
- move the helper implementation into the bpf directory, don't
touch udpgso_bench_rx
rfc v2 -> rfc v3:
- move 'x' option handling here
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And fix a couple of buglets (port option processing,
clean termination on SIGINT). This is preparatory work
for GRO tests.
rfc v2 -> rfc v3:
- use ETH_MAX_MTU
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In some scenarios, the GRO engine can assemble an UDP GRO packet
that ultimately lands on a non GRO-enabled socket.
This patch tries to address the issue explicitly checking for the UDP
socket features before enqueuing the packet, and eventually segmenting
the unexpected GRO packet, as needed.
We must also cope with re-insertion requests: after segmentation the
UDP code calls the helper introduced by the previous patches, as needed.
Segmentation is performed by a common helper, which takes care of
updating socket and protocol stats is case of failure.
rfc v3 -> v1
- fix compile issues with rxrpc
- when gso_segment returns NULL, treat is as an error
- added 'ipv4' argument to udp_rcv_segment()
rfc v2 -> rfc v3
- moved udp_rcv_segment() into net/udp.h, account errors to socket
and ns, always return NULL or segs list
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So that we can re-use it at the UDP level in the next patch
rfc v3 -> v1:
- add the helper declaration into the ipv6 header
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So that we can re-use it at the UDP level in a later patch
rfc v3 -> v1
- add the helper declaration into the ip header
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When UDP GRO is enabled, the UDP_GRO cmsg will carry the ingress
datagram size. User-space can use such info to compute the original
packets layout.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the RX counterpart of commit bec1f6f697 ("udp: generate gso
with UDP_SEGMENT"). When UDP_GRO is enabled, such socket is also
eligible for GRO in the rx path: UDP segments directed to such socket
are assembled into a larger GSO_UDP_L4 packet.
The core UDP GRO support is enabled with setsockopt(UDP_GRO).
Initial benchmark numbers:
Before:
udp rx: 1079 MB/s 769065 calls/s
After:
udp rx: 1466 MB/s 24877 calls/s
This change introduces a side effect in respect to UDP tunnels:
after a UDP tunnel creation, now the kernel performs a lookup per ingress
UDP packet, while before such lookup happened only if the ingress packet
carried a valid internal header csum.
rfc v2 -> rfc v3:
- fixed typos in macro name and comments
- really enforce UDP_GRO_CNT_MAX, instead of UDP_GRO_CNT_MAX + 1
- acquire socket lock in UDP_GRO setsockopt
rfc v1 -> rfc v2:
- use a new option to enable UDP GRO
- use static keys to protect the UDP GRO socket lookup
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The *encap_needed static keys are enabled by UDP tunnels
and several UDP encapsulations type, but they are never
turned off. This can cause unneeded overall performance
degradation for systems where such features are used
transiently.
This patch introduces complete book-keeping for such keys,
decreasing the usage at socket destruction time, if needed,
and avoiding that the same socket could increase the key
usage multiple times.
rfc v3 -> v1:
- add socket lock around udp_tunnel_encap_enable()
rfc v2 -> rfc v3:
- use udp_tunnel_encap_enable() in setsockopt()
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mike Manning says:
====================
vrf: allow simultaneous service instances in default and other VRFs
Services currently have to be VRF-aware if they are using an unbound
socket. One cannot have multiple service instances running in the
default and other VRFs for services that are not VRF-aware and listen
on an unbound socket. This is because there is no easy way of isolating
packets received in the default VRF from those arriving in other VRFs.
This series provides this isolation for stream sockets subject to the
existing kernel parameter net.ipv4.tcp_l3mdev_accept not being set,
given that this is documented as allowing a single service instance to
work across all VRF domains. Similarly, net.ipv4.udp_l3mdev_accept is
checked for datagram sockets, and net.ipv4.raw_l3mdev_accept is
introduced for raw sockets. The functionality applies to UDP & TCP
services as well as those using raw sockets, and is for IPv4 and IPv6.
Example of running ssh instances in default and blue VRF:
$ /usr/sbin/sshd -D
$ ip vrf exec vrf-blue /usr/sbin/sshd
$ ss -ta | egrep 'State|ssh'
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 0.0.0.0%vrf-blue:ssh 0.0.0.0:*
LISTEN 0 128 0.0.0.0:ssh 0.0.0.0:*
ESTAB 0 0 192.168.122.220:ssh 192.168.122.1:50282
LISTEN 0 128 [::]%vrf-blue:ssh [::]:*
LISTEN 0 128 [::]:ssh [::]:*
ESTAB 0 0 [3000::2]%vrf-blue:ssh [3000::9]:45896
ESTAB 0 0 [2000::2]:ssh [2000::9]:46398
v1:
- Address Paolo Abeni's comments (patch 4/5)
- Fix build when CONFIG_NET_L3_MASTER_DEV not defined (patch 1/5)
v2:
- Address David Aherns' comments (patches 4/5 and 5/5)
- Remove patches 3/5 and 5/5 from series for individual submissions
- Include a sysctl for raw sockets as recommended by David Ahern
- Expand series into 10 patches and provide improved descriptions
v3:
- Update description for patch 1/10 and remove patch 6/10
v4:
- Set default to enabled for raw socket sysctl as recommended by David Ahern
v5:
- Address review comments from David Ahern in patches 2-5
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
For bound udp sockets in a vrf, also check the sdif to get the index
for ingress devices enslaved to an l3mdev.
Signed-off-by: Dewi Morgan <morgand@vyatta.att-mail.com>
Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>