Pull networking fixes from David Miller:
1) The forcedeth conversion from pci_*() DMA interfaces to dma_*() ones
missed one spot. From Zhu Yanjun.
2) Missing CRYPTO_SHA256 Kconfig dep in cfg80211, from Johannes Berg.
3) Fix checksum offloading in thunderx driver, from Sunil Goutham.
4) Add SPDX to vm_sockets_diag.h, from Stephen Hemminger.
5) Fix use after free of packet headers in TIPC, from Jon Maloy.
6) "sizeof(ptr)" vs "sizeof(*ptr)" bug in i40e, from Gustavo A R Silva.
7) Tunneling fixes in mlxsw driver, from Petr Machata.
8) Fix crash in fanout_demux_rollover() of AF_PACKET, from Mike
Maloney.
9) Fix race in AF_PACKET bind() vs. NETDEV_UP notifier, from Eric
Dumazet.
10) Fix regression in sch_sfq.c due to one of the timer_setup()
conversions. From Paolo Abeni.
11) SCTP does list_for_each_entry() using wrong struct member, fix from
Xin Long.
12) Don't use big endian netlink attribute read for
IFLA_BOND_AD_ACTOR_SYSTEM, it is in cpu endianness. Also from Xin
Long.
13) Fix mis-initialization of q->link.clock in CBQ scheduler, preventing
adding filters there. From Jiri Pirko.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (67 commits)
ethernet: dwmac-stm32: Fix copyright
net: via: via-rhine: use %p to format void * address instead of %x
net: ethernet: xilinx: Mark XILINX_LL_TEMAC broken on 64-bit
myri10ge: Update MAINTAINERS
net: sched: cbq: create block for q->link.block
atm: suni: remove extraneous space to fix indentation
atm: lanai: use %p to format kernel addresses instead of %x
VSOCK: Don't set sk_state to TCP_CLOSE before testing it
atm: fore200e: use %pK to format kernel addresses instead of %x
ambassador: fix incorrect indentation of assignment statement
vxlan: use __be32 type for the param vni in __vxlan_fdb_delete
bonding: use nla_get_u64 to extract the value for IFLA_BOND_AD_ACTOR_SYSTEM
sctp: use right member as the param of list_for_each_entry
sch_sfq: fix null pointer dereference at timer expiration
cls_bpf: don't decrement net's refcount when offload fails
net/packet: fix a race in packet_bind() and packet_notifier()
packet: fix crash in fanout_demux_rollover()
sctp: remove extern from stream sched
sctp: force the params with right types for sctp csum apis
sctp: force SCTP_ERROR_INV_STRM with __u32 when calling sctp_chunk_fail
...
Don't use %x and casting to print out an address, instead use %p
and remove the casting. Cleans up smatch warnings:
drivers/net/ethernet/via/via-rhine.c:998 rhine_init_one_common()
warn: argument 4 to %lx specifier is cast from pointer
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On 64-bit (e.g. powerpc64/allmodconfig):
drivers/net/ethernet/xilinx/ll_temac_main.c: In function 'temac_start_xmit_done':
drivers/net/ethernet/xilinx/ll_temac_main.c:633:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
dev_kfree_skb_irq((struct sk_buff *)cur_p->app4);
^
cdmac_bd.app4 is u32, so it is too small to hold a kernel pointer.
Note that several other fields in struct cdmac_bd are also too small to
hold physical addresses on 64-bit platforms.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
All callers of __vxlan_fdb_delete pass vni with __be32 type, and
this param should be declared as __be32 type.
Fixes: 3ad7a4b141 ("vxlan: support fdb and learning in COLLECT_METADATA mode")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bond_opt_initval expects a u64 type param, it's better to use
nla_get_u64 to extract the value here, to eliminate a sparse
endianness mismatch warning.
Fixes: 171a42c38c ("bonding: add netlink support for sys prio, actor sys mac, and port key")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix coccicheck warning which recommends to use memdup_user():
drivers/net/wan/lmc/lmc_main.c:497:27-34: WARNING opportunity for memdup_user
Generated by: scripts/coccinelle/memdup_user/memdup_user.cocci
Signed-off-by: Vasyl Gomonovych <gomonovych@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Error code returned by 'bnxt_read_sfp_module_eeprom_info()' is handled a
few lines above when reading the A0 portion of the EEPROM.
The same should be done when reading the A2 portion of the EEPROM.
In order to correctly propagate an error, update 'rc' in this 2nd call as
well, otherwise 0 (success) is returned.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Marvell 10G PHY driver supports different hardware revisions, which
have their bits 3..0 differing. To get the correct revision number these
bits should be ignored. This patch fixes this by using the already
defined MARVELL_PHY_ID_MASK (0xfffffff0) instead of the custom
0xffffffff mask.
Fixes: 20b2af32ff ("net: phy: add Marvell Alaska X 88X3310 10Gigabit PHY support")
Suggested-by: Yan Markman <ymarkman@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the Tx ring size checks when using ethtool, by adding
an extra check in the PPv2 check_ringparam_valid helper. The Tx ring
size cannot be set to a value smaller than the minimum number of
descriptors needed for TSO.
Fixes: 1d17db08c0 ("net: mvpp2: limit TSO segments and use stop/wake thresholds")
Suggested-by: Yan Markman <ymarkman@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Short fragmented packets may never be sent by the hardware when padding
is disabled. This patch stop modifying the GMAC padding bits, to leave
them to their reset value (disabled).
Fixes: 3919357fb0 ("net: mvpp2: initialize the GMAC when using a port")
Signed-off-by: Yan Markman <ymarkman@marvell.com>
[Antoine: commit message]
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patches fixes the probe error path by cleaning up probed ports, to
avoid leaving registered net devices when the driver failed to probe.
Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an allocation in the txq_init path fails, the allocated buffers
end-up being freed twice: in the txq_init error path, and in txq_deinit.
This lead to issues as txq_deinit would work on already freed memory
regions:
kernel BUG at mm/slub.c:3915!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
This patch fixes this by removing the txq_init own error path, as the
txq_deinit function is always called on errors. This was introduced by
TSO as way more buffers are allocated.
Fixes: 186cd4d4e4 ("net: mvpp2: software tso support")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function mlxsw_sp_nexthop_rif_update() walks the list of nexthops
associated with a RIF, and updates the corresponding entries in the
switch. It is used in particular when a tunnel underlay netdevice moves
to a different VRF, and all the nexthops are migrated over to a new RIF.
The problem is that each nexthop holds a reference to its RIF, and that
is not updated. So after the old RIF is gone, further activity on these
nexthops (such as downing the underlay netdevice) dereferences a
dangling pointer.
Fix the issue by updating rif of impacted nexthops before calling
mlxsw_sp_nexthop_rif_update().
Fixes: 0c5f1cd5ba ("mlxsw: spectrum_router: Generalize __mlxsw_sp_ipip_entry_update_tunnel()")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some tunnels that are offloadable on their own can nonetheless be
demoted to slow path if their local address is in conflict with that of
another tunnel. When a route is formed for such a tunnel,
mlxsw_sp_nexthop_ipip_init() fails to find the corresponding IPIP entry,
and that triggers a FIB abort.
Resolve the problem by not assuming that a tunnel for which
mlxsw_sp_ipip_ops.can_offload() holds also automatically has an IPIP
entry.
Fixes: af641713e9 ("mlxsw: spectrum_router: Onload conflicting tunnels")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mlxsw driver currently doesn't offload GRE tunnels if they have the
same local address and use the same underlay VRF. When such a situation
arises, the tunnels in conflict are demoted to slow path.
However, the current code only verifies this condition on tunnel
creation and tunnel change, not when a tunnel is moved to a different
VRF. When the tunnel has no bound device, underlay and overlay are the
same. Thus moving a tunnel moves the underlay as well, and that can
cause local address conflict.
So modify mlxsw_sp_netdevice_ipip_ol_vrf_event() to check if there are
any conflicting tunnels, and demote them if yes.
Fixes: af641713e9 ("mlxsw: spectrum_router: Onload conflicting tunnels")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a new local route is added, an IPIP entry is looked up to determine
whether the route should be offloaded as a tunnel decap or as a trap.
That decision should take into account whether the tunnel netdevice in
question is actually IFF_UP, and only install a decap offload if it is.
Fixes: 0063587d35 ("mlxsw: spectrum: Support decap-only IP-in-IP tunnels")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2017-11-27
This series contains updates to e1000, e1000e and i40e.
Gustavo A. R. Silva fixes a sizeof() issue where we were taking the size of
the pointer (which is always the size of the pointer).
Sasha does a follow up fix to a previous fix for buffer overrun, to resolve
community feedback from David Laight and the use of magic numbers.
Amritha fixes the reporting of error codes for when adding a cloud filter
fails.
Ahmad Fatoum brushes the dust off the e1000 driver to fix a code comment
and debug message which was incorrect about what the code was really doing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding cloud filters could fail for a number of reasons,
unsupported filter fields for example, which fails during
validation of fields itself. This will not result in admin
command errors and converting the admin queue status to posix
error code using i40e_aq_rc_to_posix would result in incorrect
error values. If the failure was due to AQ error itself,
reporting that correctly is handled in the inner function.
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is a follow on to commit b10effb92e ("fix buffer overrun while the
I219 is processing DMA transactions") to address David Laights concerns
about the use of "magic" numbers. So define masks as well as add
additional code comments to give a better understanding of what needs to
be done to avoid a buffer overrun.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Reviewed-by: Alexander H Duyck <alexander.h.duyck@intel.com>
Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Reviewed-by: Raanan Avargil <raanan.avargil@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
sizeof when applied to a pointer typed expression gives the size of
the pointer.
The proper fix in this particular case is to code sizeof(*vfres)
instead of sizeof(vfres).
This issue was detected with the help of Coccinelle.
Signed-off-by: Gustavo A R Silva <garsilva@embeddedor.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
v2:
* Replace busy wait with wait_event()/wake_up_all()
* Cannot garantee that at the time xennet_remove is called, the
xen_netback state will not be XenbusStateClosed, so added a
condition for that
* There's a small chance for the xen_netback state is
XenbusStateUnknown by the time the xen_netfront switches to Closed,
so added a condition for that.
When unloading module xen_netfront from guest, dmesg would output
warning messages like below:
[ 105.236836] xen:grant_table: WARNING: g.e. 0x903 still in use!
[ 105.236839] deferring g.e. 0x903 (pfn 0x35805)
This problem relies on netfront and netback being out of sync. By the time
netfront revokes the g.e.'s netback didn't have enough time to free all of
them, hence displaying the warnings on dmesg.
The trick here is to make netfront to wait until netback frees all the g.e.'s
and only then continue to cleanup for the module removal, and this is done by
manipulating both device states.
Signed-off-by: Eduardo Otubo <otubo@redhat.com>
Acked-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* CRYPTO_SHA256 is needed for regdb validation
* mac80211: mesh path metric was wrong in some frames
* mac80211: use QoS null-data packets on QoS connections
* mac80211: tear down RX aggregation sessions first to
drop fewer packets in HW restart scenarios
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlob6QMACgkQB8qZga/f
l8SAvw/9E4fttLsdBhUT1gUJPoeNWsSXyhdoo0QWHs4Ni//bayNUL/Wx8QL6MGf/
nprgxDygqFXH5iUO2gwpaMl2xeO1N7Px7bpKV/KvrDX/UAaEDyrkg6B87Omx95vT
i584pU0LwLVL40CT+saKpF+kChe3S/rMPrAQR0xBNPntyIlo9w0vdOkdTC+vw0br
/Nl3YJxpAmOmTDa8tiKLWHg4hSIS3Q7stylUjnEKNev4nBjBaPqfBW6sy/3o4NiR
MfSCNvGeipr0QFZT31cHDry1fRZ/RvD23KSYNMggzctpYf6IRuHSEhIS3mW7GLpd
GKZyLa7G7mZJAKXJAXOxb2PBBrXl/nTqSHJfme0VroGsYdqLHThXd47/0qaAB4Oo
Io84WSmQX41kRyy4KNU7z0OFR9RgLbctdW/hwGeODxKEagGNNjuz+lbzql7jtYLn
8UZK3LZHYC+qribCHR4Uf6A5rJP7bAcl36Ub/8l7AXNVesLagiaE0FGGrewGnE6X
6qQLoYznd4/bjLQLVou56R9/t+o1nUuvmcGMkRxAlhvX3hw5nSUxgkWS/TgCdrBg
cKddmM78AfEakjd6e21bH2QI2tG6fXS913lwIOq9JU870yVlNpUEr98t5nwYNP3P
l5kzxr237Yj4W9LMUmeZIWo/laiAtgYEeQZAynAZWNxXNBdOvaA=
=ODX1
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-davem-2017-11-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Four fixes:
* CRYPTO_SHA256 is needed for regdb validation
* mac80211: mesh path metric was wrong in some frames
* mac80211: use QoS null-data packets on QoS connections
* mac80211: tear down RX aggregation sessions first to
drop fewer packets in HW restart scenarios
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When connected to a QoS/WMM AP, mac80211 should use a QoS NDP
for probing it, instead of a regular non-QoS one, fix this.
Change all the drivers to *not* allow QoS NDP for now, even
though it looks like most of them should be OK with that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Pull timer updates from Thomas Gleixner:
- The final conversion of timer wheel timers to timer_setup().
A few manual conversions and a large coccinelle assisted sweep and
the removal of the old initialization mechanisms and the related
code.
- Remove the now unused VSYSCALL update code
- Fix permissions of /proc/timer_list. I still need to get rid of that
file completely
- Rename a misnomed clocksource function and remove a stale declaration
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
m68k/macboing: Fix missed timer callback assignment
treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts
timer: Remove redundant __setup_timer*() macros
timer: Pass function down to initialization routines
timer: Remove unused data arguments from macros
timer: Switch callback prototype to take struct timer_list * argument
timer: Pass timer_list pointer to callbacks unconditionally
Coccinelle: Remove setup_timer.cocci
timer: Remove setup_*timer() interface
timer: Remove init_timer() interface
treewide: setup_timer() -> timer_setup() (2 field)
treewide: setup_timer() -> timer_setup()
treewide: init_timer() -> setup_timer()
treewide: Switch DEFINE_TIMER callbacks to struct timer_list *
s390: cmm: Convert timers to use timer_setup()
lightnvm: Convert timers to use timer_setup()
drivers/net: cris: Convert timers to use timer_setup()
drm/vc4: Convert timers to use timer_setup()
block/laptop_mode: Convert timers to use timer_setup()
net/atm/mpc: Avoid open-coded assignment of timer callback function
...
Commit 86dabda426 ("net: thunderbolt: Clear finished Tx frame bus
address in tbnet_tx_callback()") fixed a DMA-API violation where the
driver called dma_unmap_page() in tbnet_free_buffers() for a bus address
that might already be unmapped. The fix was to zero out the bus address
of a frame in tbnet_tx_callback().
However, as pointed out by David Miller, zero might well be valid
mapping (at least in theory) so it is not good idea to use it here.
It turns out that we don't need the whole map/unmap dance for Tx buffers
at all. Instead we can map the buffers when they are initially allocated
and unmap them when the interface is brought down. In between we just
DMA sync the buffers for the CPU or device as needed.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't offload IP header checksum to NIC.
This fixes a previous patch which enabled checksum offloading
for both IPv4 and IPv6 packets. So L3 checksum offload was
getting enabled for IPv6 pkts. And HW is dropping these pkts
as it assumes the pkt is IPv4 when IP csum offload is set
in the SQ descriptor.
Fixes: 3a9024f52c ("net: thunderx: Enable TSO and checksum offloads for ipv6")
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function pci_unmap_page is obsolete. So it is replaced with
the function dma_unmap_page.
CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Joe Jin <joe.jin@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the function ipvlan_get_L3_hdr, current codes use pskb_may_pull to
make sure the skb header has enough linear room for ipv6 header. But it
would use the latter memory directly without linear check when it is icmp.
So it still may access the unepxected memory in ipvlan_addr_lookup.
Now invoke the pskb_may_pull again if it is ipv6 icmp.
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the function ipvlan_get_L3_hdr, current codes use pskb_may_pull to
make sure the skb header has enough linear room for arp header. But it
would access the arp payload in func ipvlan_addr_lookup. So it still may
access the unepxected memory.
Now use arp_hdr_len(port->dev) instead of the arp header as the param.
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano pointed that configure or show UDP_ZERO_CSUM6_RX/TX info doesn't
make sense if we haven't enabled CONFIG_IPV6. Fix it by adding
if IS_ENABLED(CONFIG_IPV6) check.
Fixes: abe492b4f5 ("geneve: UDP checksum configuration via netlink")
Fixes: fd7eafd021 ("geneve: fix fill_info when link down")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
First set of fixes for 4.15. Most important here is the iwlwifi fix
for scan command firmware interface change.
ath10k
* fix CCMP-256, GCMP and GCMP-256 in raw mode, it was never working
wcn36xx
* fix device tree node search
iwlwifi
* fix a regression with firmware API change of scan cmd (introduced in
firmware version 34)
* add a bunch of PCI IDs and fix configuration structs for A000 devices
* fix the exported firmware name strings for 9000 and A000 devices
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJaFZQwAAoJEG4XJFUm622bOe8H/0dmodmL+nf0MUKYyXmV2l5J
3wN0Xrse1NL3bRDTGZ/KkX/OFB1FSZ8ViGW0ek9DvbbF4lC9E5WqjM5HbTY7QE6i
bBgF476qnV9Hy/ay9SEMe7HhxsU8KfVKuYgNzY8pHhlCLkr96h35YNt+LtD319zT
PTBrxURLkZotr44mDQjO8duyKZFCafWN6F5nd22+JWpOEQDK4rUNNdDT/16CwQiy
aLqCF5bHB4kOigd9YfEpx1jO6GusqGqkB0x3lwG0QR7oygFgLZbWre0TllE+ly8P
x7gy40euxBWh9Hlk+4xjxBfZraE41XzXTTHyJECVRLKJcrlEvzCc+QCOyoYjZpY=
=uYE+
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-for-davem-2017-11-22' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.15
First set of fixes for 4.15. Most important here is the iwlwifi fix
for scan command firmware interface change.
ath10k
* fix CCMP-256, GCMP and GCMP-256 in raw mode, it was never working
wcn36xx
* fix device tree node search
iwlwifi
* fix a regression with firmware API change of scan cmd (introduced in
firmware version 34)
* add a bunch of PCI IDs and fix configuration structs for A000 devices
* fix the exported firmware name strings for 9000 and A000 devices
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Fixes 2017-11-21
This series contains fixes for igb/vf, ixgbe/vf, i40e/vf and fm10k.
Jake fixes a regression issue with older firmware, where we were using
the NVM lock to synchronize NVM reads for all devices and firmware
versions, yet this caused issues with older firmware prior to version
1.5. Fixed this by only grabbing the lock for newer devices and firmware
version 1.5 or newer.
Zijie Pan fixes the calculation of the i40e VF MAC addresses, where it was
possible to increment to the next MAC entry without calling
i40e_add_mac_filter().
Amritha removes the upper limit of 64 queues on a channel VSI since the
upper bound is determined by the VSI's num_queue_pairs.
Filip fixes an issue during FLR resets, where should have been checking
for upcoming core reset and if so, just return with I40E_ERR_NOT_READY.
Alan fixes the notifying clients of l2 parameters by copying the
parameters to the client instance struct and re-organizes the priority
in which the client tasks fire so that if the flag for notifying l2
params is set, it will trigger before the client open task. Also fixed
the promiscuous settings after reset for all the VSI's.
Brian King from IBM fixes an issue seen on Power systems which would
result in skb list corruption and eventual kernel oops. Brian
provides the same fix for nearly all our drivers, to replace the
read_barrier_depends with smp_rmb() to ensure loads are ordered with
respect to the load of tx_buffer->next_to_watch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The PHY on BCM7278 has an additional bit that needs to be cleared:
IDDQ_GLOBAL_PWR, without doing this, the PHY remains stuck in reset out
of suspend/resume cycles.
Fixes: 0fe9933804 ("net: dsa: bcm_sf2: Add support for BCM7278 integrated switch")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2017-11-23
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Several BPF offloading fixes, from Jakub. Among others:
- Limit offload to cls_bpf and XDP program types only.
- Move device validation into the driver and don't make
any assumptions about the device in the classifier due
to shared blocks semantics.
- Don't pass offloaded XDP program into the driver when
it should be run in native XDP instead. Offloaded ones
are not JITed for the host in such cases.
- Don't destroy device offload state when moved to
another namespace.
- Revert dumping offload info into user space for now,
since ifindex alone is not sufficient. This will be
redone properly for bpf-next tree.
2) Fix test_verifier to avoid using bpf_probe_write_user()
helper in test cases, since it's dumping a warning into
kernel log which may confuse users when only running tests.
Switch to use bpf_trace_printk() instead, from Yonghong.
3) Several fixes for correcting ARG_CONST_SIZE_OR_ZERO semantics
before it becomes uabi, from Gianluca. More specifically:
- Add a type ARG_PTR_TO_MEM_OR_NULL that is used only
by bpf_csum_diff(), where the argument is either a
valid pointer or NULL. The subsequent ARG_CONST_SIZE_OR_ZERO
then enforces a valid pointer in case of non-0 size
or a valid pointer or NULL in case of size 0. Given
that, the semantics for ARG_PTR_TO_MEM in combination
with ARG_CONST_SIZE_OR_ZERO are now such that in case
of size 0, the pointer must always be valid and cannot
be NULL. This fix in semantics allows for bpf_probe_read()
to drop the recently added size == 0 check in the helper
that would become part of uabi otherwise once released.
At the same time we can then fix bpf_probe_read_str() and
bpf_perf_event_output() to use ARG_CONST_SIZE_OR_ZERO
instead of ARG_CONST_SIZE in order to fix recently
reported issues by Arnaldo et al, where LLVM optimizes
two boundary checks into a single one for unknown
variables where the verifier looses track of the variable
bounds and thus rejects valid programs otherwise.
4) A fix for the verifier for the case when it detects
comparison of two constants where the branch is guaranteed
to not be taken at runtime. Verifier will rightfully prune
the exploration of such paths, but we still pass the program
to JITs, where they would complain about using reserved
fields, etc. Track such dead instructions and sanitize
them with mov r0,r0. Rejection is not possible since LLVM
may generate them for valid C code and doesn't do as much
data flow analysis as verifier. For bpf-next we might
implement removal of such dead code and adjust branches
instead. Fix from Alexei.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tuntap and similar devices can inject GSO packets. Accept type
VIRTIO_NET_HDR_GSO_UDP, even though not generating UFO natively.
Processes are expected to use feature negotiation such as TUNSETOFFLOAD
to detect supported offload types and refrain from injecting other
packets. This process breaks down with live migration: guest kernels
do not renegotiate flags, so destination hosts need to expose all
features that the source host does.
Partially revert the UFO removal from 182e0b6b5846~1..d9d30adf5677.
This patch introduces nearly(*) no new code to simplify verification.
It brings back verbatim tuntap UFO negotiation, VIRTIO_NET_HDR_GSO_UDP
insertion and software UFO segmentation.
It does not reinstate protocol stack support, hardware offload
(NETIF_F_UFO), SKB_GSO_UDP tunneling in SKB_GSO_SOFTWARE or reception
of VIRTIO_NET_HDR_GSO_UDP packets in tuntap.
To support SKB_GSO_UDP reappearing in the stack, also reinstate
logic in act_csum and openvswitch. Achieve equivalence with v4.13 HEAD
by squashing in commit 939912216f ("net: skb_needs_check() removes
CHECKSUM_UNNECESSARY check for tx.") and reverting commit 8d63bee643
("net: avoid skb_warn_bad_offload false positives on UFO").
(*) To avoid having to bring back skb_shinfo(skb)->ip6_frag_id,
ipv6_proxy_select_ident is changed to return a __be32 and this is
assigned directly to the frag_hdr. Also, SKB_GSO_UDP is inserted
at the end of the enum to minimize code churn.
Tested
Booted a v4.13 guest kernel with QEMU. On a host kernel before this
patch `ethtool -k eth0` shows UFO disabled. After the patch, it is
enabled, same as on a v4.13 host kernel.
A UFO packet sent from the guest appears on the tap device:
host:
nc -l -p -u 8000 &
tcpdump -n -i tap0
guest:
dd if=/dev/zero of=payload.txt bs=1 count=2000
nc -u 192.16.1.1 8000 < payload.txt
Direct tap to tap transmission of VIRTIO_NET_HDR_GSO_UDP succeeds,
packets arriving fragmented:
./with_tap_pair.sh ./tap_send_ufo tap0 tap1
(from https://github.com/wdebruij/kerneltools/tree/master/tests)
Changes
v1 -> v2
- simplified set_offload change (review comment)
- documented test procedure
Link: http://lkml.kernel.org/r/<CAF=yD-LuUeDuL9YWPJD9ykOZ0QCjNeznPDr6whqZ9NGMNF12Mw@mail.gmail.com>
Fixes: fb652fdfe8 ("macvlan/macvtap: Remove NETIF_F_UFO advertisement.")
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 6fa1ba6152 partially
implemented the new ethtool API, by replacing get_settings()
with get_link_ksettings(). This breaks ethtool, since the
userspace tool (according to the new API specs) never tries
the legacy set() call, when the new get() call succeeds.
All attempts to chance some setting from userspace result in:
> Cannot set new settings: Operation not supported
Implement the missing set() call.
Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change resolves a new compile-time warning
when built as a loadable module:
WARNING: modpost: missing MODULE_LICENSE() in drivers/net/phy/cortina.o
see include/linux/module.h for more information
This adds the license as "GPL", which matches the header of the file.
MODULE_DESCRIPTION and MODULE_AUTHOR are also added.
Signed-off-by: Jesse Chan <jc@linux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The original issue being fixed in this patch was seen with the ixgbe
driver, but the same issue exists with i40evf as well, as the code is
very similar. read_barrier_depends is not sufficient to ensure
loads following it are not speculatively loaded out of order
by the CPU, which can result in stale data being loaded, causing
potential system crashes.
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The original issue being fixed in this patch was seen with the ixgbe
driver, but the same issue exists with fm10k as well, as the code is
very similar. read_barrier_depends is not sufficient to ensure
loads following it are not speculatively loaded out of order
by the CPU, which can result in stale data being loaded, causing
potential system crashes.
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The original issue being fixed in this patch was seen with the ixgbe
driver, but the same issue exists with igb as well, as the code is
very similar. read_barrier_depends is not sufficient to ensure
loads following it are not speculatively loaded out of order
by the CPU, which can result in stale data being loaded, causing
potential system crashes.
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The original issue being fixed in this patch was seen with the ixgbe
driver, but the same issue exists with igbvf as well, as the code is
very similar. read_barrier_depends is not sufficient to ensure
loads following it are not speculatively loaded out of order
by the CPU, which can result in stale data being loaded, causing
potential system crashes.
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The original issue being fixed in this patch was seen with the ixgbe
driver, but the same issue exists with ixgbevf as well, as the code is
very similar. read_barrier_depends is not sufficient to ensure
loads following it are not speculatively loaded out of order
by the CPU, which can result in stale data being loaded, causing
potential system crashes.
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The original issue being fixed in this patch was seen with the ixgbe
driver, but the same issue exists with i40e as well, as the code is
very similar. read_barrier_depends is not sufficient to ensure
loads following it are not speculatively loaded out of order
by the CPU, which can result in stale data being loaded, causing
potential system crashes.
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes an issue seen on Power systems with ixgbe which results
in skb list corruption and an eventual kernel oops. The following is what
was observed:
CPU 1 CPU2
============================ ============================
1: ixgbe_xmit_frame_ring ixgbe_clean_tx_irq
2: first->skb = skb eop_desc = tx_buffer->next_to_watch
3: ixgbe_tx_map read_barrier_depends()
4: wmb check adapter written status bit
5: first->next_to_watch = tx_desc napi_consume_skb(tx_buffer->skb ..);
6: writel(i, tx_ring->tail);
The read_barrier_depends is insufficient to ensure that tx_buffer->skb does not
get loaded prior to tx_buffer->next_to_watch, which then results in loading
a stale skb pointer. This patch replaces the read_barrier_depends with
smp_rmb to ensure loads are ordered with respect to the load of
tx_buffer->next_to_watch.
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
After a reset we rebuild the VSIs which is going to clobber any
promiscuous settings we had before reset. This makes it so that we
restore the promiscuous settings we had before reset.
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The current method for notifying clients of l2 parameters is broken
because we fail to copy the new parameters to the client instance
struct, we need to do the notification before the client 'open' function
pointer gets called, and lastly we should set the l2 parameters when
first adding a client instance.
This patch first introduces the i40evf_client_get_params function to
prevent code duplication in the i40evf_client_add_instance and the
i40evf_notify_client_l2_params functions. We then fix the notify l2
params function to actually copy the parameters to client instance
struct and do the same in the *_add_instance' function. Lastly this
patch reorganizes the priority in which client tasks fire so that if the
flag for notifying l2 params is set, it will trigger before the open
because the client needs these new parameters as part of a client open
task.
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch allows detection of upcoming core reset in case NIC gets
stuck while performing FLR reset. The i40e_pf_reset() function returns
I40E_ERR_NOT_READY when global reset was detected.
Signed-off-by: Filip Sadowski <filip.sadowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It is safe to remove the upper limit of 64 queues on a channel
VSI. The upper bound is determined by the VSI's num_queue_pairs
and gets validated when the queue mapping info through mqprio
interface is subject to bound checking in the driver.
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>