Pull networking fixes from David Miller:
1) Fix regression in SKB partial checksum handling, from Pravin B
Shalar.
2) Fix VLAN inside of VXLAN handling in i40e driver, from Jesse
Brandeburg.
3) Cure softlockups during accept() in SCTP, from Karl Heiss.
4) MSG_PEEK should return multiple SKBs worth of data in AF_UNIX, from
Aaron Conole.
5) IPV6 erroneously ignores output interface specifier in lookup key for
route lookups, fix from David Ahern.
6) In Marvell DSA driver, forward unknown frames to CPU port, from
Andrew Lunn.
7) Mission flow flag initializations in some code paths, from David
Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: Initialize flow flags in input path
net: dsa: fix preparation of a port STP update
testptp: Silence compiler warnings on ppc64
net/mlx4: Handle return codes in mlx4_qp_attach_common
dsa: mv88e6xxx: Enable forwarding for unknown to the CPU port
skbuff: Fix skb checksum partial check.
net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set
net sysfs: Print link speed as signed integer
bna: fix error handling
af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag
af_unix: Convert the unix_sk macro to an inline function for type safety
net: sctp: Don't use 64 kilobyte lookup table for four elements
l2tp: protect tunnel->del_work by ref_count
net/ibm/emac: bump version numbers for correct work with ethtool
sctp: Prevent soft lockup when sctp_accept() is called during a timeout event
sctp: Whitespace fix
i40e/i40evf: check for stopped admin queue
i40e: fix VLAN inside VXLAN
r8169: fix handling rtl_readphy result
net: hisilicon: fix handling platform_get_irq result
- Fixes for mlx5 related issues
- Fixes for ipoib multicast handling
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWCfALAAoJELgmozMOVy/dc+MQAKoD6echYpTkWE0otMuHQcYf
zMaVVots+JdRKpA6OqHYQHgKGA80z21BpnjGYwcwB5zB1zPrJwz4vxwGlOBHt01T
xLBReFgSKyJlgOWLXKfPx4bXUdivOBKm203wY0dh+/dC/VROGYoiXYTmSDsfsuKa
8OXT1kWgzRVLtqwqj5GSkgWvtFZ28CjKh6d9egjqcj9tpbh2UupQDZzMyOtZ52X6
Nz/Vo3u4T7qjzlhHOlCwHCDw+97x0yvmvLY1mWweGPfKOnxtXjkzQmTQEpyzU5Mo
EwcqJucrBnmjbLAIBMrbR1mzTUQeD4dHz1jx+EzWE0lVnRL3twe1UaY40176sNlm
aCBA4bIOQ242r3IJ++ss15ol1k5hu7PYKRn9Q8d2sSbQGcSnCHe/YOutQQ+FTEFG
yE9xiLL+pgT8koauROnxg66E3HDM78NGTpjP3EuG4r2Qwa1iFANPfDB6kikuv8bO
rG3qUJcloEPvfatZY+h5QC4UCoB0/W1DAhlfzE3tPBYPmhSEgQDfEOzXTKDakeF0
VB903bYrOL3CVOun4I7fLrDc1leVeiAUKqO2orZs3qIpRWvAKyV/VjolAusMv2+F
/4xPyh95AEMTFfmZogOCofQFk3eOnkWpLdrVTYCKy3i6NVBoy2wHldrl+LuCAN/m
r/DNRBmazShashbeU6wg
=8+cX
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
- Fixes for mlx5 related issues
- Fixes for ipoib multicast handling
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/ipoib: increase the max mcast backlog queue
IB/ipoib: Make sendonly multicast joins create the mcast group
IB/ipoib: Expire sendonly multicast joins
IB/mlx5: Remove pa_lkey usages
IB/mlx5: Remove support for IB_DEVICE_LOCAL_DMA_LKEY
IB/iser: Add module parameter for always register memory
xprtrdma: Replace global lkey with lkey local to PD
Both new_steering_entry() and existing_steering_entry() return values
based on their success or failure, but currently they fall through
silently. This can make troubleshooting difficult, as we were unable
to tell which one of these two functions returned errors or
specifically what code was returned. This patch remedies that
situation by passing the return codes to err, which is returned by
mlx4_qp_attach_common() itself.
This also addresses a leak in the call to mlx4_bitmap_free() as well.
Signed-off-by: Robb Manes <rmanes@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several functions can return negative value in case of error,
so their return type should be fixed as well as type of variables
to which this value is assigned.
The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1].
[1]: http://permalink.gmane.org/gmane.linux.kernel/2046107
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The size of the MAC register dump used to be the size specified by the
reg property in the device tree. Userland has no good way of finding
out that size, and it was not specified consistently for each MAC type,
so ethtool would end up printing junk at the end of the register dump
if the device tree didn't match the size it assumed.
Using the new version numbers indicates unambiguously that the size of
the MAC register dump is dependent only on the MAC type.
Fixes: 5369c71f7c ("net/ibm/emac: fix size of emac dump memory areas")
Signed-off-by: Ivan Mikhaylov <ivan@ru.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's possible that while we are waiting for the spinlock, another
entity (that owns the spinlock) has shut down the admin queue.
If we then attempt to use the queue, we will panic.
Add a check for this condition on the receive side. This matches
an existing check on the send queue side.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously to this patch, the hardware was removing
VLAN tags from the inner header of VXLAN packets. The
hardware configuration can be changed to leave the
packet alone since that is what the linux stack
expects for this type of VLAN in VXLAN packet.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function can return negative value.
The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1].
[1]: http://permalink.gmane.org/gmane.linux.kernel/2046107
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function can return negative value.
The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1].
[1]: http://permalink.gmane.org/gmane.linux.kernel/2046107
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) When we run a tap on netlink sockets, we have to copy mmap'd SKBs
instead of cloning them. From Daniel Borkmann.
2) When converting classical BPF into eBPF, fix the setting of the
source reg to BPF_REG_X. From Tycho Andersen.
3) Fix igmpv3/mldv2 report parsing in the bridge multicast code, from
Linus Lussing.
4) Fix dst refcounting for ipv6 tunnels, from Martin KaFai Lau.
5) Set NLM_F_REPLACE flag properly when replacing ipv6 routes, from
Roopa Prabhu.
6) Add some new cxgb4 PCI device IDs, from Hariprasad Shenai.
7) Fix headroom tests and SKB leaks in ipv6 fragmentation code, from
Florian Westphal.
8) Check DMA mapping errors in bna driver, from Ivan Vecera.
9) Several 8139cp bug fixes (dev_kfree_skb_any in interrupt context,
misclearing of interrupt status in TX timeout handler, etc.) from
David Woodhouse.
10) In tipc, reset SKB header pointer after skb_linearize(), from Erik
Hugne.
11) Fix autobind races et al. in netlink code, from Herbert Xu with
help from Tejun Heo and others.
12) Missing SET_NETDEV_DEV in sunvnet driver, from Sowmini Varadhan.
13) Fix various races in timewait timer and reqsk_queue_hadh_req, from
Eric Dumazet.
14) Fix array overruns in mac80211, from Johannes Berg and Dan
Carpenter.
15) Fix data race in rhashtable_rehash_one(), from Dmitriy Vyukov.
16) Fix race between poll_one_napi and napi_disable, from Neil Horman.
17) Fix byte order in geneve tunnel port config, from John W Linville.
18) Fix handling of ARP replies over lightweight tunnels, from Jiri
Benc.
19) We can loop when fib rule dumps cross multiple SKBs, fix from Wilson
Kok and Roopa Prabhu.
20) Several reference count handling bug fixes in the PHY/MDIO layer
from Russel King.
21) Fix lockdep splat in ppp_dev_uninit(), from Guillaume Nault.
22) Fix crash in icmp_route_lookup(), from David Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
net: Fix panic in icmp_route_lookup
net: update docbook comment for __mdiobus_register()
ppp: fix lockdep splat in ppp_dev_uninit()
net: via/Kconfig: GENERIC_PCI_IOMAP required if PCI not selected
phy: marvell: add link partner advertised modes
net: fix net_device refcounting
phy: add phy_device_remove()
phy: fixed-phy: properly validate phy in fixed_phy_update_state()
net: fix phy refcounting in a bunch of drivers
of_mdio: fix MDIO phy device refcounting
phy: add proper phy struct device refcounting
phy: fix mdiobus module safety
net: dsa: fix of_mdio_find_bus() device refcount leak
phy: fix of_mdio_find_bus() device refcount leak
ip6_tunnel: Reduce log level in ip6_tnl_err() to debug
ip6_gre: Reduce log level in ip6gre_err() to debug
fib_rules: fix fib rule dumps across multiple skbs
bnx2x: byte swap rss_key to comply to Toeplitz specs
net: revert "net_sched: move tp->root allocation into fw_init()"
lwtunnel: remove source and destination UDP port config option
...
The builds of allmodconfig of avr32 is failing with:
drivers/net/ethernet/via/via-rhine.c:1098:2: error: implicit declaration
of function 'pci_iomap' [-Werror=implicit-function-declaration]
drivers/net/ethernet/via/via-rhine.c:1119:2: error: implicit declaration
of function 'pci_iounmap' [-Werror=implicit-function-declaration]
The generic empty pci_iomap and pci_iounmap is used only if CONFIG_PCI
is not defined and CONFIG_GENERIC_PCI_IOMAP is defined.
Add GENERIC_PCI_IOMAP in the dependency list for VIA_RHINE as we are
getting build failure when CONFIG_PCI and CONFIG_GENERIC_PCI_IOMAP both
are not defined.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 96249d70dd ("IB/core: Guarantee that a local_dma_lkey
is available") allows ULPs that make use of the local dma key to keep
working as before by allocating a DMA MR with local permissions and
converted these consumers to use the MR associated with the PD
rather then device->local_dma_lkey.
ConnectIB has some known issues with memory registration
using the local_dma_lkey (SEND, RDMA, RECV seems to work ok).
Thus don't expose support for it (remove device->local_dma_lkey
setting), and take advantage of the above commit such that no regression
is introduced to working systems.
The local_dma_lkey support will be restored in CX4 depending on FW
capability query.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add a phy_device_remove() function to complement phy_device_register(),
which undoes the effects of phy_device_register() by removing the phy
device from visibility, but not freeing it.
This allows these details to be moved out of the mdio bus code into
the phy code where this action belongs.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_phy_find_device() increments the phy struct device refcount, which
we need to properly balance. Add code to network drivers using this
function to ensure that the struct device refcount is correctly
balanced.
For xgene, looking back in the history, we should be able to use
of_phy_connect() with a zero flags argument for the DT case as this is
how the driver used to operate prior to de7b5b3d79 ("net: eth: xgene:
change APM X-Gene SoC platform ethernet to support ACPI").
This leaves the Cavium Thunder BGX unfixed; fixing this driver is a
complicated task, one which the maintainers need to be involved with.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
After a good amount of debugging, I found bnx2x was byte swaping
the 40 bytes of rss_key.
If we byte swap the key, then bnx2x generates hashes matching
MSDN specs as documented in (Verifying the RSS Hash Calculation)
https://msdn.microsoft.com/en-us/library/windows/hardware/ff571021%
28v=vs.85%29.aspx
It is mostly a non issue, unless we want to mix different NIC
in a host, and want consistent hashing among all of them, ie
if they all use the boot time generated rss key, or if some application
is choosing specific tuple(s) so that incoming traffic lands into known
rx queue(s).
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device is set as wakeup capable using proper wakeup API but the
driver misuses IRQF_NO_SUSPEND to set the interrupt as wakeup source
which is incorrect.
This patch removes the use of IRQF_NO_SUSPEND flags replacing it with
enable_irq_wake instead.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Claudiu Manoil <claudiu.manoil@freescale.com>
Cc: Kevin Hao <haokexin@gmail.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are seeing unexplained TX timeouts under heavy load. Let's try to get
a better idea of what's going on.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The low 16 bits of the 'opts1' field in the TX descriptor are supposed
to still contain the buffer length when the descriptor is handed back to
us. In practice, at least on my hardware, they don't. So stash the
original value of the opts1 field and get the length to unmap from
there.
There are other ways we could have worked out the length, but I actually
want a stash of the opts1 field anyway so that I can dump it alongside
the contents of the descriptor ring when we suffer a TX timeout.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We calculate the value of the opts1 descriptor field in three different
places. With two different behaviours when given an invalid packet to
be checksummed — none of them correct. Sort that out.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sending a TSO frame in multiple buffers, we were neglecting to set
the first descriptor up in TSO mode.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After a certain amount of staring at the debug output of this driver, I
realised it was lying to me.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If an RX interrupt was already received but NAPI has not yet run when
the RX timeout happens, we end up in cp_tx_timeout() with RX interrupts
already disabled. Blindly re-enabling them will cause an IRQ storm.
(This is made particularly horrid by the fact that cp_interrupt() always
returns that it's handled the interrupt, even when it hasn't actually
done anything. If it didn't do that, the core IRQ code would have
detected the storm and handled it, I'd have had a clear smoking gun
backtrace instead of just a spontaneously resetting router, and I'd have
at *least* two days of my life back. Changing the return value of
cp_interrupt() will be argued about under separate cover.)
Unconditionally leave RX interrupts disabled after the reset, and
schedule NAPI to check the receive ring and re-enable them.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A deadlock trace is seen in netcp driver with lockup detector enabled.
The trace log is provided below for reference. This patch fixes the
bug by removing the usage of netcp_modules_lock within ndo_ops functions.
ndo_{open/close/ioctl)() is already called with rtnl_lock held. So there
is no need to hold another mutex for serialization across processes on
multiple cores. So remove use of netcp_modules_lock mutex from these
ndo ops functions.
ndo_set_rx_mode() shouldn't be using a mutex as it is called from atomic
context. In the case of ndo_set_rx_mode(), there can be call to this API
without rtnl_lock held from an atomic context. As the underlying modules
are expected to add address to a hardware table, it is to be protected
across concurrent updates and hence a spin lock is used to synchronize
the access. Same with ndo_vlan_rx_add_vid() & ndo_vlan_rx_kill_vid().
Probably the netcp_modules_lock is used to protect the module not being
removed as part of rmmod. Currently this is not fully implemented and
assumes the interface is brought down before doing rmmod of modules.
The support for rmmmod while interface is up is expected in a future
patch set when additional modules such as pa, qos are added. For now
all of the tests such as if up/down, reboot, iperf works fine with this
patch applied.
Deadlock trace seen with lockup detector enabled is shown below for
reference.
[ 16.863014] ======================================================
[ 16.869183] [ INFO: possible circular locking dependency detected ]
[ 16.875441] 4.1.6-01265-gfb1e101 #1 Tainted: G W
[ 16.881176] -------------------------------------------------------
[ 16.887432] ifconfig/1662 is trying to acquire lock:
[ 16.892386] (netcp_modules_lock){+.+.+.}, at: [<c03e8110>]
netcp_ndo_open+0x168/0x518
[ 16.900321]
[ 16.900321] but task is already holding lock:
[ 16.906144] (rtnl_mutex){+.+.+.}, at: [<c053a418>] devinet_ioctl+0xf8/0x7e4
[ 16.913206]
[ 16.913206] which lock already depends on the new lock.
[ 16.913206]
[ 16.921372]
[ 16.921372] the existing dependency chain (in reverse order) is:
[ 16.928844]
-> #1 (rtnl_mutex){+.+.+.}:
[ 16.932865] [<c06023f0>] mutex_lock_nested+0x68/0x4a8
[ 16.938521] [<c04c5758>] register_netdev+0xc/0x24
[ 16.943831] [<c03e65c0>] netcp_module_probe+0x214/0x2ec
[ 16.949660] [<c03e8a54>] netcp_register_module+0xd4/0x140
[ 16.955663] [<c089654c>] keystone_gbe_init+0x10/0x28
[ 16.961233] [<c000977c>] do_one_initcall+0xb8/0x1f8
[ 16.966714] [<c0867e04>] kernel_init_freeable+0x148/0x1e8
[ 16.972720] [<c05f9994>] kernel_init+0xc/0xe8
[ 16.977682] [<c0010038>] ret_from_fork+0x14/0x3c
[ 16.982905]
-> #0 (netcp_modules_lock){+.+.+.}:
[ 16.987619] [<c006eab0>] lock_acquire+0x118/0x320
[ 16.992928] [<c06023f0>] mutex_lock_nested+0x68/0x4a8
[ 16.998582] [<c03e8110>] netcp_ndo_open+0x168/0x518
[ 17.004064] [<c04c48f0>] __dev_open+0xa8/0x10c
[ 17.009112] [<c04c4b74>] __dev_change_flags+0x94/0x144
[ 17.014853] [<c04c4c3c>] dev_change_flags+0x18/0x48
[ 17.020334] [<c053a9fc>] devinet_ioctl+0x6dc/0x7e4
[ 17.025729] [<c04a59ec>] sock_ioctl+0x1d0/0x2a8
[ 17.030865] [<c0142844>] do_vfs_ioctl+0x41c/0x688
[ 17.036173] [<c0142ae4>] SyS_ioctl+0x34/0x5c
[ 17.041046] [<c000ff60>] ret_fast_syscall+0x0/0x54
[ 17.046441]
[ 17.046441] other info that might help us debug this:
[ 17.046441]
[ 17.054434] Possible unsafe locking scenario:
[ 17.054434]
[ 17.060343] CPU0 CPU1
[ 17.064862] ---- ----
[ 17.069381] lock(rtnl_mutex);
[ 17.072522] lock(netcp_modules_lock);
[ 17.078875] lock(rtnl_mutex);
[ 17.084532] lock(netcp_modules_lock);
[ 17.088366]
[ 17.088366] *** DEADLOCK ***
[ 17.088366]
[ 17.094279] 1 lock held by ifconfig/1662:
[ 17.098278] #0: (rtnl_mutex){+.+.+.}, at: [<c053a418>]
devinet_ioctl+0xf8/0x7e4
[ 17.105774]
[ 17.105774] stack backtrace:
[ 17.110124] CPU: 1 PID: 1662 Comm: ifconfig Tainted: G W
4.1.6-01265-gfb1e101 #1
[ 17.118637] Hardware name: Keystone
[ 17.122123] [<c00178e4>] (unwind_backtrace) from [<c0013cbc>]
(show_stack+0x10/0x14)
[ 17.129862] [<c0013cbc>] (show_stack) from [<c05ff450>]
(dump_stack+0x84/0xc4)
[ 17.137079] [<c05ff450>] (dump_stack) from [<c0068e34>]
(print_circular_bug+0x210/0x330)
[ 17.145161] [<c0068e34>] (print_circular_bug) from [<c006ab7c>]
(validate_chain.isra.35+0xf98/0x13ac)
[ 17.154372] [<c006ab7c>] (validate_chain.isra.35) from [<c006da60>]
(__lock_acquire+0x52c/0xcc0)
[ 17.163149] [<c006da60>] (__lock_acquire) from [<c006eab0>]
(lock_acquire+0x118/0x320)
[ 17.171058] [<c006eab0>] (lock_acquire) from [<c06023f0>]
(mutex_lock_nested+0x68/0x4a8)
[ 17.179140] [<c06023f0>] (mutex_lock_nested) from [<c03e8110>]
(netcp_ndo_open+0x168/0x518)
[ 17.187484] [<c03e8110>] (netcp_ndo_open) from [<c04c48f0>]
(__dev_open+0xa8/0x10c)
[ 17.195133] [<c04c48f0>] (__dev_open) from [<c04c4b74>]
(__dev_change_flags+0x94/0x144)
[ 17.203129] [<c04c4b74>] (__dev_change_flags) from [<c04c4c3c>]
(dev_change_flags+0x18/0x48)
[ 17.211560] [<c04c4c3c>] (dev_change_flags) from [<c053a9fc>]
(devinet_ioctl+0x6dc/0x7e4)
[ 17.219729] [<c053a9fc>] (devinet_ioctl) from [<c04a59ec>]
(sock_ioctl+0x1d0/0x2a8)
[ 17.227378] [<c04a59ec>] (sock_ioctl) from [<c0142844>]
(do_vfs_ioctl+0x41c/0x688)
[ 17.234939] [<c0142844>] (do_vfs_ioctl) from [<c0142ae4>]
(SyS_ioctl+0x34/0x5c)
[ 17.242242] [<c0142ae4>] (SyS_ioctl) from [<c000ff60>]
(ret_fast_syscall+0x0/0x54)
[ 17.258855] netcp-1.0 2620110.netcp eth0: Link is Up - 1Gbps/Full - flow
control off
[ 17.271282] BUG: sleeping function called from invalid context at
kernel/locking/mutex.c:616
[ 17.279712] in_atomic(): 1, irqs_disabled(): 0, pid: 1662, name: ifconfig
[ 17.286500] INFO: lockdep is turned off.
[ 17.290413] Preemption disabled at:[< (null)>] (null)
[ 17.295728]
[ 17.297214] CPU: 1 PID: 1662 Comm: ifconfig Tainted: G W
4.1.6-01265-gfb1e101 #1
[ 17.305735] Hardware name: Keystone
[ 17.309223] [<c00178e4>] (unwind_backtrace) from [<c0013cbc>]
(show_stack+0x10/0x14)
[ 17.316970] [<c0013cbc>] (show_stack) from [<c05ff450>]
(dump_stack+0x84/0xc4)
[ 17.324194] [<c05ff450>] (dump_stack) from [<c06023b0>]
(mutex_lock_nested+0x28/0x4a8)
[ 17.332112] [<c06023b0>] (mutex_lock_nested) from [<c03e9840>]
(netcp_set_rx_mode+0x160/0x210)
[ 17.340724] [<c03e9840>] (netcp_set_rx_mode) from [<c04c483c>]
(dev_set_rx_mode+0x1c/0x28)
[ 17.348982] [<c04c483c>] (dev_set_rx_mode) from [<c04c490c>]
(__dev_open+0xc4/0x10c)
[ 17.356724] [<c04c490c>] (__dev_open) from [<c04c4b74>]
(__dev_change_flags+0x94/0x144)
[ 17.364729] [<c04c4b74>] (__dev_change_flags) from [<c04c4c3c>]
(dev_change_flags+0x18/0x48)
[ 17.373166] [<c04c4c3c>] (dev_change_flags) from [<c053a9fc>]
(devinet_ioctl+0x6dc/0x7e4)
[ 17.381344] [<c053a9fc>] (devinet_ioctl) from [<c04a59ec>]
(sock_ioctl+0x1d0/0x2a8)
[ 17.388994] [<c04a59ec>] (sock_ioctl) from [<c0142844>]
(do_vfs_ioctl+0x41c/0x688)
[ 17.396563] [<c0142844>] (do_vfs_ioctl) from [<c0142ae4>]
(SyS_ioctl+0x34/0x5c)
[ 17.403873] [<c0142ae4>] (SyS_ioctl) from [<c000ff60>]
(ret_fast_syscall+0x0/0x54)
[ 17.413772] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
udhcpc (v1.20.2) started
Sending discover...
[ 18.690666] netcp-1.0 2620110.netcp eth0: Link is Up - 1Gbps/Full - flow
control off
Sending discover...
[ 22.250972] netcp-1.0 2620110.netcp eth0: Link is Up - 1Gbps/Full - flow
control off
[ 22.258721] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 22.265458] BUG: sleeping function called from invalid context at
kernel/locking/mutex.c:616
[ 22.273896] in_atomic(): 1, irqs_disabled(): 0, pid: 342, name: kworker/1:1
[ 22.280854] INFO: lockdep is turned off.
[ 22.284767] Preemption disabled at:[< (null)>] (null)
[ 22.290074]
[ 22.291568] CPU: 1 PID: 342 Comm: kworker/1:1 Tainted: G W
4.1.6-01265-gfb1e101 #1
[ 22.300255] Hardware name: Keystone
[ 22.303750] Workqueue: ipv6_addrconf addrconf_dad_work
[ 22.308895] [<c00178e4>] (unwind_backtrace) from [<c0013cbc>]
(show_stack+0x10/0x14)
[ 22.316643] [<c0013cbc>] (show_stack) from [<c05ff450>]
(dump_stack+0x84/0xc4)
[ 22.323867] [<c05ff450>] (dump_stack) from [<c06023b0>]
(mutex_lock_nested+0x28/0x4a8)
[ 22.331786] [<c06023b0>] (mutex_lock_nested) from [<c03e9840>]
(netcp_set_rx_mode+0x160/0x210)
[ 22.340394] [<c03e9840>] (netcp_set_rx_mode) from [<c04c9d18>]
(__dev_mc_add+0x54/0x68)
[ 22.348401] [<c04c9d18>] (__dev_mc_add) from [<c05ab358>]
(igmp6_group_added+0x168/0x1b4)
[ 22.356580] [<c05ab358>] (igmp6_group_added) from [<c05ad2cc>]
(ipv6_dev_mc_inc+0x4f0/0x5a8)
[ 22.365019] [<c05ad2cc>] (ipv6_dev_mc_inc) from [<c058f0d0>]
(addrconf_dad_work+0x21c/0x33c)
[ 22.373460] [<c058f0d0>] (addrconf_dad_work) from [<c0042850>]
(process_one_work+0x214/0x8d0)
[ 22.381986] [<c0042850>] (process_one_work) from [<c0042f54>]
(worker_thread+0x48/0x4bc)
[ 22.390071] [<c0042f54>] (worker_thread) from [<c004868c>]
(kthread+0xf0/0x108)
[ 22.397381] [<c004868c>] (kthread) from [<c0010038>]
Trace related to incorrect usage of mutex inside ndo_set_rx_mode
[ 24.086066] BUG: sleeping function called from invalid context at
kernel/locking/mutex.c:616
[ 24.094506] in_atomic(): 1, irqs_disabled(): 0, pid: 1682, name: ifconfig
[ 24.101291] INFO: lockdep is turned off.
[ 24.105203] Preemption disabled at:[< (null)>] (null)
[ 24.110511]
[ 24.112005] CPU: 2 PID: 1682 Comm: ifconfig Tainted: G W
4.1.6-01265-gfb1e101 #1
[ 24.120518] Hardware name: Keystone
[ 24.124018] [<c00178e4>] (unwind_backtrace) from [<c0013cbc>]
(show_stack+0x10/0x14)
[ 24.131772] [<c0013cbc>] (show_stack) from [<c05ff450>]
(dump_stack+0x84/0xc4)
[ 24.138989] [<c05ff450>] (dump_stack) from [<c06023b0>]
(mutex_lock_nested+0x28/0x4a8)
[ 24.146908] [<c06023b0>] (mutex_lock_nested) from [<c03e9840>]
(netcp_set_rx_mode+0x160/0x210)
[ 24.155523] [<c03e9840>] (netcp_set_rx_mode) from [<c04c483c>]
(dev_set_rx_mode+0x1c/0x28)
[ 24.163787] [<c04c483c>] (dev_set_rx_mode) from [<c04c490c>]
(__dev_open+0xc4/0x10c)
[ 24.171531] [<c04c490c>] (__dev_open) from [<c04c4b74>]
(__dev_change_flags+0x94/0x144)
[ 24.179528] [<c04c4b74>] (__dev_change_flags) from [<c04c4c3c>]
(dev_change_flags+0x18/0x48)
[ 24.187966] [<c04c4c3c>] (dev_change_flags) from [<c053a9fc>]
(devinet_ioctl+0x6dc/0x7e4)
[ 24.196145] [<c053a9fc>] (devinet_ioctl) from [<c04a59ec>]
(sock_ioctl+0x1d0/0x2a8)
[ 24.203803] [<c04a59ec>] (sock_ioctl) from [<c0142844>]
(do_vfs_ioctl+0x41c/0x688)
[ 24.211373] [<c0142844>] (do_vfs_ioctl) from [<c0142ae4>]
(SyS_ioctl+0x34/0x5c)
[ 24.218676] [<c0142ae4>] (SyS_ioctl) from [<c000ff60>]
(ret_fast_syscall+0x0/0x54)
[ 24.227156] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently netcp_rxpool_refill() that refill descriptors and attached
buffers to fdq while interrupt is enabled as part of NAPI poll. Doing
it while interrupt is disabled could be beneficial as hardware will
not be starved when CPU is busy with processing interrupt.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently netcp_module_probe() doesn't check the return value of
of_parse_phandle() that points to the interface data for the
module and then pass the node ptr to the module which is incorrect.
Check for return value and free the intf_modpriv if there is error.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, if netcp_allocate_rx_buf() fails due no descriptors
in the rx free descriptor queue, inside the netcp_rxpool_refill() function
the iterative loop to fill buffers doesn't terminate right away. So modify
the netcp_allocate_rx_buf() to return an error code and use it break the
loop when there is error.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netcp interface is not fully initialized before attach the module
to the interface. For example, the tx pipe/rx pipe is initialized
in ethss module as part of attach(). So until this is complete, the
interface can't be registered. So move registration of interface to
net device outside the current loop that attaches the modules to the
interface.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
netcp_core is the first driver that will get initialized and the modules
(ethss, pa etc) will then get initialized. So the code at the end of
netcp_probe() that iterate over the modules is a dead code as the module
list will be always be empty. So remove this code.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On K2HK, sgmii module registers of slave 0 and 1 are mem
mapped to one contiguous block, while those of slave 2
and 3 are mapped to another contiguous block. However,
on K2E and K2L, sgmii module registers of all slaves are
mem mapped to one contiguous block. SGMII APIs expect
slave 0 sgmii base when API is invoked for slave 0 and 1,
and slave 2 sgmii base when invoked for other slaves.
Before this patch, slave 0 sgmii base is always passed to
sgmii API for K2E regardless which slave is the API invoked
for. This patch fixes the problem.
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
`ls /sys/devices/channel-devices/vnet-port-0-0/net' is missing without
this change, and applications like NetworkManager are looking in
sysfs for the information.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This platform driver has a OF device ID table but the OF module
alias information is not created so module autoloading won't work.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This platform driver has a OF device ID table but the OF module
alias information is not created so module autoloading won't work.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This platform driver has a OF device ID table but the OF module
alias information is not created so module autoloading won't work.
Signed-off-by: Luis de Bethencourt <luis@osg.samsung.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This platform driver has a OF device ID table but the OF module
alias information is not created so module autoloading won't work.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This platform driver has a OF device ID table but the OF module
alias information is not created so module autoloading won't work.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unless we reset the RX config, on real hardware I don't seem to receive
any packets after a TX timeout.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This can be called from cp_tx_timeout() with interrupts disabled.
Spotted by Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check for DMA mapping errors, recover from them and register them in
ethtool stats like other errors.
Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The callback for adding vxlan port can be called with the same port for
both IPv4 and IPv6. Do not disable the offloading when the same port for
both protocols is added and later one of them removed.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The callback for adding vxlan port can be called with the same port for both
IPv4 and IPv6. Do not disable the offloading if this occurs.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The callback for adding vxlan port can be called with the same port for
both IPv4 and IPv6. Do not disable the offloading when the same port for
both protocols is added and later one of them removed.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drivers needs to export the OF id table and this be built into
the module or udev won't have the necessary information to autoload
the driver module when the device is registered via OF.
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When changing rss key, we do not want to overwrite user provided key
by the one provided by netdev_rss_key_fill(), which is the host random
key generated at boot time.
Fixes: 947cbb0ac2 ("net/mlx4_en: Support for configurable RSS hash function")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eyal Perry <eyalpe@mellanox.com>
CC: Amir Vadai <amirv@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The reset delays used for stmmac are in the order of 10ms to 1 second,
which is far too long for udelay usage, so switch to using msleep.
Practically this fixes the PHY not being reliably detected in some cases
as udelay wouldn't actually delay for long enough to let the phy
reliably be reset.
Signed-off-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a regression introduced by the commit a84e328941
("net: mvneta: fix refilling for Rx DMA buffers"). Due to this commit
the newly allocated Rx buffers are DMA-unmapped in place of those passed
to the networking stack. Obviously, this causes data corruptions.
This patch fixes the issue by ensuring that the right Rx buffers are
DMA-unmapped.
Reported-by: Oren Laskin <oren@igneous.io>
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Fixes: a84e328941 ("net: mvneta: fix refilling for Rx DMA buffers")
Cc: <stable@vger.kernel.org> # v3.8+
Tested-by: Oren Laskin <oren@igneous.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a preparatory patch for moving irq_data struct members. Search
and replace was done with coccinelle
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Amir Vadai <amirv@mellanox.com>
commit c48f350ff5 "bnx2x: Add MFW dump support" added the
bnx2x_update_mfw_dump() function that reads the current time and stores
it in a 32-bit field that gets passed into a buffer in a fixed format.
This is potentially broken when the epoch overflows in 2038, and
otherwise overflows in 2106. As we're trying to avoid uses of
struct timeval for this reason, I noticed the addition of this
function, and tried to rewrite it in a way that is more explicit
about the overflow and that will keep working once we deprecate
struct timeval.
I assume that it is not possible to change the ABI any more, otherwise
we should try to use a 64-bit field for the seconds right away.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Yuval Mintz <Yuval.Mintz@qlogic.com>
Cc: Ariel Elior <Ariel.Elior@qlogic.com>
Acked-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix out-of-bounds array access in netfilter ipset, from Jozsef
Kadlecsik.
2) Use correct free operation on netfilter conntrack templates, from
Daniel Borkmann.
3) Fix route leak in SCTP, from Marcelo Ricardo Leitner.
4) Fix sizeof(pointer) in mac80211, from Thierry Reding.
5) Fix cache pointer comparison in ip6mr leading to missed unlock of
mrt_lock. From Richard Laing.
6) rds_conn_lookup() needs to consider network namespace in key
comparison, from Sowmini Varadhan.
7) Fix deadlock in TIPC code wrt broadcast link wakeups, from Kolmakov
Dmitriy.
8) Fix fd leaks in bpf syscall, from Daniel Borkmann.
9) Fix error recovery when installing ipv6 multipath routes, we would
delete the old route before we would know if we could fully commit
to the new set of nexthops. Fix from Roopa Prabhu.
10) Fix run-time suspend problems in r8152, from Hayes Wang.
11) In fec, don't program the MAC address into the chip when the clocks
are gated off. From Fugang Duan.
12) Fix poll behavior for netlink sockets when using rx ring mmap, from
Daniel Borkmann.
13) Don't allocate memory with GFP_KERNEL from get_stats64 in r8169
driver, from Corinna Vinschen.
14) In TCP Cubic congestion control, handle idle periods better where we
are application limited, in order to keep cwnd from growing out of
control. From Eric Dumzet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
tcp_cubic: better follow cubic curve after idle period
tcp: generate CA_EVENT_TX_START on data frames
xen-netfront: respect user provided max_queues
xen-netback: respect user provided max_queues
r8169: Fix sleeping function called during get_stats64, v2
ether: add IEEE 1722 ethertype - TSN
netlink, mmap: fix edge-case leakages in nf queue zero-copy
netlink, mmap: don't walk rx ring on poll if receive queue non-empty
cxgb4: changes for new firmware 1.14.4.0
net: fec: add netif status check before set mac address
r8152: fix the runtime suspend issues
r8152: split DRIVER_VERSION
ipv6: fix ifnullfree.cocci warnings
add microchip LAN88xx phy driver
stmmac: fix check for phydev being open
net: qlcnic: delete redundant memsets
net: mv643xx_eth: use kzalloc
net: jme: use kzalloc() instead of kmalloc+memset
net: cavium: liquidio: use kzalloc in setup_glist()
net: ipv6: use common fib_default_rule_pref
...
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104031
Fixes: 6e85d5ad36
Based on the discussion starting at
http://www.spinics.net/lists/netdev/msg342193.html
Tested locally on RTL8168evl/8111evl with various concurrent processes
accessing /proc/net/dev while changing the link state as well as
removing/reloading the r8169 module.
Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Tested-by: poma <pomidorabelisima@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>