The LAN8740, like the 8720, also requires a reset after enabling clock.
The datasheet [1] 3.8.5.1 says:
"During a Hardware reset, an external clock must be supplied
to the XTAL1/CLKIN signal."
I have observed this issue on a custom i.MX6 based board with
the LAN8740A.
[1] http://ww1.microchip.com/downloads/en/DeviceDoc/8740a.pdf
Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Having Rx-only AF_XDP sockets can potentially lead to a crash in the
system by a NULL pointer dereference in xsk_umem_consume_tx(). This
function iterates through a list of all sockets tied to a umem and
checks if there are any packets to send on the Tx ring. Rx-only
sockets do not have a Tx ring, so this will cause a NULL pointer
dereference. This will happen if you have registered one or more
Rx-only sockets to a umem and the driver is checking the Tx ring even
on Rx, or if the XDP_SHARED_UMEM mode is used and there is a mix of
Rx-only and other sockets tied to the same umem.
Fixed by only putting sockets with a Tx component on the list that
xsk_umem_consume_tx() iterates over.
Fixes: ac98d8aab6 ("xsk: wire upp Tx zero-copy functions")
Reported-by: Kal Cutter Conley <kal.conley@dectris.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/bpf/1571645818-16244-1-git-send-email-magnus.karlsson@intel.com
UDP IPv6 packets auto flowlabels are using a 32bit secret
(static u32 hashrnd in net/core/flow_dissector.c) and
apply jhash() over fields known by the receivers.
Attackers can easily infer the 32bit secret and use this information
to identify a device and/or user, since this 32bit secret is only
set at boot time.
Really, using jhash() to generate cookies sent on the wire
is a serious security concern.
Trying to change the rol32(hash, 16) in ip6_make_flowlabel() would be
a dead end. Trying to periodically change the secret (like in sch_sfq.c)
could change paths taken in the network for long lived flows.
Let's switch to siphash, as we did in commit df453700e8
("inet: switch IP ID generator to siphash")
Using a cryptographically strong pseudo random function will solve this
privacy issue and more generally remove other weak points in the stack.
Packet schedulers using skb_get_hash_perturb() benefit from this change.
Fixes: b56774163f ("ipv6: Enable auto flow labels by default")
Fixes: 42240901f7 ("ipv6: Implement different admin modes for automatic flow labels")
Fixes: 67800f9b1f ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel")
Fixes: cb1ce2ef38 ("ipv6: Implement automatic flow label generation on transmit")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jonathan Berger <jonathann1@walla.com>
Reported-by: Amit Klein <aksecurity@gmail.com>
Reported-by: Benny Pinkas <benny@pinkas.net>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is one more problematic case I noticed while recently fixing BPF kallsyms
handling in cd7455f101 ("bpf: Fix use after free in subprog's jited symbol
removal") and that is bpf_get_prog_name().
If BTF has been attached to the prog, then we may be able to fetch the function
signature type id in kallsyms through prog->aux->func_info[prog->aux->func_idx].type_id.
However, while the BTF object itself is torn down via RCU callback, the prog's
aux->func_info is immediately freed via kvfree(prog->aux->func_info) once the
prog's refcount either hit zero or when subprograms were already exposed via
kallsyms and we hit the error path added in 5482e9a93c ("bpf: Fix memleak in
aux->func_info and aux->btf").
This violates RCU as well since kallsyms could be walked in parallel where we
could access aux->func_info. Hence, defer kvfree() to after RCU grace period.
Looking at ba64e7d852 ("bpf: btf: support proper non-jit func info") there
is no reason/dependency where we couldn't defer the kvfree(aux->func_info) into
the RCU callback.
Fixes: 5482e9a93c ("bpf: Fix memleak in aux->func_info and aux->btf")
Fixes: ba64e7d852 ("bpf: btf: support proper non-jit func info")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/875f2906a7c1a0691f2d567b4d8e4ea2739b1e88.1571779205.git.daniel@iogearbox.net
Include <net/addrconf.h> for the missing declarations of
various functions. Fixes the following sparse warnings:
net/ipv6/addrconf_core.c:94:5: warning: symbol 'register_inet6addr_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c💯5: warning: symbol 'unregister_inet6addr_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:106:5: warning: symbol 'inet6addr_notifier_call_chain' was not declared. Should it be static?
net/ipv6/addrconf_core.c:112:5: warning: symbol 'register_inet6addr_validator_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:118:5: warning: symbol 'unregister_inet6addr_validator_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:125:5: warning: symbol 'inet6addr_validator_notifier_call_chain' was not declared. Should it be static?
net/ipv6/addrconf_core.c:237:6: warning: symbol 'in6_dev_finish_destroy' was not declared. Should it be static?
Signed-off-by: Ben Dooks (Codethink) <ben.dooks@codethink.co.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Kernel test robot reported that the l2tp.sh test script failed:
# selftests: net: l2tp.sh
# Warning: file l2tp.sh is not executable, correct this.
Set executable bits.
Fixes: e858ef1cd4 ("selftests: Add l2tp tests")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
We get one warnings when build kernel W=1:
net/sched/sch_taprio.c:1155:6: warning: no previous prototype for ‘taprio_offload_config_changed’ [-Wmissing-prototypes]
Make the function static to fix this.
Fixes: 9c66d15646 ("taprio: Add support for hardware offloading")
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Michael Chan says:
====================
Devlink and error recovery bug fix patches.
Most of the work is by Vasundhara Volam.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
With the recently added error recovery logic, the device may already
be disabled if the firmware recovery is unsuccessful. In
bnxt_remove_one(), check that the device is still enabled first
before calling pci_disable_device().
Fixes: 3bc7d4a352 ("bnxt_en: Add BNXT_STATE_IN_FW_RESET state.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
When firmware indicates that driver needs to invoke firmware reset
which is common for both error recovery and live firmware reset path,
driver needs a different time to wait before polling for firmware
readiness.
Modify the wait time to fw_reset_min_dsecs, which is initialised to
correct timeout for error recovery and firmware reset.
Fixes: 4037eb7156 ("bnxt_en: Add a new BNXT_FW_RESET_STATE_POLL_FW_DOWN state.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
The current code does not do endian swapping between the devlink
parameter and the internal NVRAM representation. Define a union to
represent the little endian NVRAM data and add 2 helper functions to
copy to and from the NVRAM data with the proper byte swapping.
Fixes: 782a624d00 ("bnxt_en: Add bnxt_en initial port params table and register it")
Cc: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
The current code that rounds up the NVRAM parameter bit size to the next
byte size for the devlink parameter is not always correct. The MSIX
devlink parameters are 4 bytes and we don't get the correct size
using this method.
Fix it by adding a new dl_num_bytes member to the bnxt_dl_nvm_param
structure which statically provides bytesize information according
to the devlink parameter type definition.
Fixes: 782a624d00 ("bnxt_en: Add bnxt_en initial port params table and register it")
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
When the address width of DMA is greater than 32, the packet header occupies
a BD descriptor. The starting address of the data should be added to the
header length.
Fixes: a993db88d1 ("net: stmmac: Enable support for > 32 Bits addressing in XGMAC")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Jose Abreu <joabreu@synopsys.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: yuqi jin <jinyuqi@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
The ionic driver started using dymamic_hex_dump(), but
that is not always defined:
drivers/net/ethernet/pensando/ionic/ionic_main.c:229:2: error: implicit declaration of function 'dynamic_hex_dump' [-Werror,-Wimplicit-function-declaration]
Add a dummy implementation to use when CONFIG_DYNAMIC_DEBUG
is disabled, printing nothing.
Fixes: 938962d552 ("ionic: Add adminq action")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
syzkaller managed to trigger the following crash:
[...]
BUG: unable to handle page fault for address: ffffc90001923030
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD aa551067 P4D aa551067 PUD aa552067 PMD a572b067 PTE 80000000a1173163
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 7982 Comm: syz-executor912 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:bpf_jit_binary_hdr include/linux/filter.h:787 [inline]
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:531 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:600 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find kernel/bpf/core.c:674 [inline]
RIP: 0010:is_bpf_text_address+0x184/0x3b0 kernel/bpf/core.c:709
[...]
Call Trace:
kernel_text_address kernel/extable.c:147 [inline]
__kernel_text_address+0x9a/0x110 kernel/extable.c:102
unwind_get_return_address+0x4c/0x90 arch/x86/kernel/unwind_frame.c:19
arch_stack_walk+0x98/0xe0 arch/x86/kernel/stacktrace.c:26
stack_trace_save+0xb6/0x150 kernel/stacktrace.c:123
save_stack mm/kasan/common.c:69 [inline]
set_track mm/kasan/common.c:77 [inline]
__kasan_kmalloc+0x11c/0x1b0 mm/kasan/common.c:510
kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:518
slab_post_alloc_hook mm/slab.h:584 [inline]
slab_alloc mm/slab.c:3319 [inline]
kmem_cache_alloc+0x1f5/0x2e0 mm/slab.c:3483
getname_flags+0xba/0x640 fs/namei.c:138
getname+0x19/0x20 fs/namei.c:209
do_sys_open+0x261/0x560 fs/open.c:1091
__do_sys_open fs/open.c:1115 [inline]
__se_sys_open fs/open.c:1110 [inline]
__x64_sys_open+0x87/0x90 fs/open.c:1110
do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
[...]
After further debugging it turns out that we walk kallsyms while in parallel
we tear down a BPF program which contains subprograms that have been JITed
though the program itself has not been fully exposed and is eventually bailing
out with error.
The bpf_prog_kallsyms_del_subprogs() in bpf_prog_load()'s error path removes
the symbols, however, bpf_prog_free() tears down the JIT memory too early via
scheduled work. Instead, it needs to properly respect RCU grace period as the
kallsyms walk for BPF is under RCU.
Fix it by refactoring __bpf_prog_put()'s tear down and reuse it in our error
path where we defer final destruction when we have subprogs in the program.
Fixes: 7d1982b4e3 ("bpf: fix panic in prog load calls cleanup")
Fixes: 1c2a088a66 ("bpf: x64: add JIT support for multi-function programs")
Reported-by: syzbot+710043c5d1d5b5013bc7@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: syzbot+710043c5d1d5b5013bc7@syzkaller.appspotmail.com
Link: https://lore.kernel.org/bpf/55f6367324c2d7e9583fa9ccf5385dcbba0d7a6e.1571752452.git.daniel@iogearbox.net
It seems I forgot to add handling of devmap_hash type maps to the device
unregister hook for devmaps. This omission causes devices to not be
properly released, which causes hangs.
Fix this by adding the missing handler.
Fixes: 6f9d451ab1 ("xdp: Add devmap_hash map type for looking up devices by hashed index")
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191019111931.2981954-1-toke@redhat.com
This device is sold as 'ThinkPad USB-C Dock Gen 2 (40AS)'.
Chipset is RTL8153 and works with r8152.
Without this, the generic cdc_ether grabs the device, and the device jam
connected networks up when the machine suspends.
Signed-off-by: Kazutoshi Noguchi <noguchi.kazutosi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the iph field from the state structure, which is not
properly initialized. Instead, add a new field to make the "do we want
to set DF" be the state bit and move the code to set the DF flag from
ip_frag_next().
Joint work with Pablo and Linus.
Fixes: 19c3401a91 ("net: ipv4: place control buffer handling away from fragmentation iterators")
Reported-by: Patrick Schönthaler <patrick@notvads.ovh>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl2qDnsACgkQSD+KveBX
+j60KQgAo4bJcC3e0p38gZvt6GMt5Li0+BK1OnhIf/hW9DsEJPzqNtw4g7hqXCU+
cH0YqB3q26Gq6Qg9Akuza/rY8Vfmpl1kZzfqg5/7REqC1jJBwBeIh8lOpwQh4QEh
m//hxSq/r9M59J1SdQzUu8H7xj1+aD0xrJJ3LCL2uVwe1pk7ho8sOAwjCwvidJaJ
fb56sBrkWjcL/k/Zxtic2e9Y3awfLWwN0QifPOWc/tehaMCwUpc04/o4yAyjyOy5
jY5pioEYZ6BBlLHNVZO2/kjXPIQQQt9n6H+60uVud9DWDhGwQMI2b1/3LJha6q1t
Hg2/wiHIcsoe5TZQoI5HiQNTIw6bVw==
=jBeW
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2019-10-18' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 kTLS fixes 18-10-2019
This series introduces kTLS related fixes to mlx5 driver from Tariq,
and two misc memory leak fixes form Navid Emamdoost.
Please pull and let me know if there is any problem.
I would appreciate it if you queue up kTLS fixes from the list below to
stable kernel v5.3 !
For -stable v4.13:
nett/mlx5: prevent memory leak in mlx5_fpga_conn_create_cq
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"I was battling a cold after some recent trips, so quite a bit piled up
meanwhile, sorry about that.
Highlights:
1) Fix fd leak in various bpf selftests, from Brian Vazquez.
2) Fix crash in xsk when device doesn't support some methods, from
Magnus Karlsson.
3) Fix various leaks and use-after-free in rxrpc, from David Howells.
4) Fix several SKB leaks due to confusion of who owns an SKB and who
should release it in the llc code. From Eric Biggers.
5) Kill a bunc of KCSAN warnings in TCP, from Eric Dumazet.
6) Jumbo packets don't work after resume on r8169, as the BIOS resets
the chip into non-jumbo mode during suspend. From Heiner Kallweit.
7) Corrupt L2 header during MPLS push, from Davide Caratti.
8) Prevent possible infinite loop in tc_ctl_action, from Eric
Dumazet.
9) Get register bits right in bcmgenet driver, based upon chip
version. From Florian Fainelli.
10) Fix mutex problems in microchip DSA driver, from Marek Vasut.
11) Cure race between route lookup and invalidation in ipv4, from Wei
Wang.
12) Fix performance regression due to false sharing in 'net'
structure, from Eric Dumazet"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (145 commits)
net: reorder 'struct net' fields to avoid false sharing
net: dsa: fix switch tree list
net: ethernet: dwmac-sun8i: show message only when switching to promisc
net: aquantia: add an error handling in aq_nic_set_multicast_list
net: netem: correct the parent's backlog when corrupted packet was dropped
net: netem: fix error path for corrupted GSO frames
macb: propagate errors when getting optional clocks
xen/netback: fix error path of xenvif_connect_data()
net: hns3: fix mis-counting IRQ vector numbers issue
net: usb: lan78xx: Connect PHY before registering MAC
vsock/virtio: discard packets if credit is not respected
vsock/virtio: send a credit update when buffer size is changed
mlxsw: spectrum_trap: Push Ethernet header before reporting trap
net: ensure correct skb->tstamp in various fragmenters
net: bcmgenet: reset 40nm EPHY on energy detect
net: bcmgenet: soft reset 40nm EPHYs before MAC init
net: phy: bcm7xxx: define soft_reset for 40nm EPHY
net: bcmgenet: don't set phydev->link from MAC
net: Update address for MediaTek ethernet driver in MAINTAINERS
ipv4: fix race condition between route lookup and invalidation
...
Intel test robot reported a ~7% regression on TCP_CRR tests
that they bisected to the cited commit.
Indeed, every time a new TCP socket is created or deleted,
the atomic counter net->count is touched (via get_net(net)
and put_net(net) calls)
So cpus might have to reload a contended cache line in
net_hash_mix(net) calls.
We need to reorder 'struct net' fields to move @hash_mix
in a read mostly cache line.
We move in the first cache line fields that can be
dirtied often.
We probably will have to address in a followup patch
the __randomize_layout that was added in linux-4.13,
since this might break our placement choices.
Fixes: 355b985537 ("netns: provide pure entropy for net_hash_mix()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If there are multiple switch trees on the device, only the last one
will be listed, because the arguments of list_add_tail are swapped.
Fixes: 83c0afaec7 ("net: dsa: Add new binding implementation")
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Printing the info message every time more than the max number of mac
addresses are requested generates unnecessary log spam. Showing it only
when the hw is not already in promiscous mode is equally informative
without being annoying.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
add an error handling in aq_nic_set_multicast_list, it may not
work when hw_multicast_list_set error; and at the same time
it will remove gcc Wunused-but-set-variable warning.
Signed-off-by: Chenwandun <chenwandun@huawei.com>
Reviewed-by: Igor Russkikh <igor.russkikh@aquantia.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
net: netem: fix further issues with packet corruption
This set is fixing two more issues with the netem packet corruption.
First patch (which was previously posted) avoids NULL pointer dereference
if the first frame gets freed due to allocation or checksum failure.
v2 improves the clarity of the code a little as requested by Cong.
Second patch ensures we don't return SUCCESS if the frame was in fact
dropped. Thanks to this commit message for patch 1 no longer needs the
"this will still break with a single-frame failure" disclaimer.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If packet corruption failed we jump to finish_segs and return
NET_XMIT_SUCCESS. Seeing success will make the parent qdisc
increment its backlog, that's incorrect - we need to return
NET_XMIT_DROP.
Fixes: 6071bd1aa1 ("netem: Segment GSO packets on enqueue")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To corrupt a GSO frame we first perform segmentation. We then
proceed using the first segment instead of the full GSO skb and
requeue the rest of the segments as separate packets.
If there are any issues with processing the first segment we
still want to process the rest, therefore we jump to the
finish_segs label.
Commit 177b800746 ("net: netem: fix backlog accounting for
corrupted GSO frames") started using the pointer to the first
segment in the "rest of segments processing", but as mentioned
above the first segment may had already been freed at this point.
Backlog corrections for parent qdiscs have to be adjusted.
Fixes: 177b800746 ("net: netem: fix backlog accounting for corrupted GSO frames")
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tx_clk, rx_clk, and tsu_clk are optional. Currently the macb driver
marks clock as not available if it receives an error when trying to get
a clock. This is wrong, because a clock controller might return
-EPROBE_DEFER if a clock is not available, but will eventually become
available.
In these cases, the driver would probe successfully but will never be
able to adjust the clocks, because the clocks were not available during
probe, but became available later.
For example, the clock controller for the ZynqMP is implemented in the
PMU firmware and the clocks are only available after the firmware driver
has been probed.
Use devm_clk_get_optional() in instead of devm_clk_get() to get the
optional clock and propagate all errors to the calling function.
Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Tested-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
xenvif_connect_data() calls module_put() in case of error. This is
wrong as there is no related module_get().
Remove the superfluous module_put().
Fixes: 279f438e36 ("xen-netback: Don't destroy the netdev until the vif is shut down")
Cc: <stable@vger.kernel.org> # 3.12
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the num_msi_left means the vector numbers of NIC,
but if the PF supported RoCE, it contains the vector numbers
of NIC and RoCE(Not expected).
This may cause interrupts lost in some case, because of the
NIC module used the vector resources which belongs to RoCE.
This patch adds a new variable num_nic_msi to store the vector
numbers of NIC, and adjust the default TQP numbers and rss_size
according to the value of num_nic_msi.
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge misc fixes from Andrew Morton:
"Rather a lot of fixes, almost all affecting mm/"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (26 commits)
scripts/gdb: fix debugging modules on s390
kernel/events/uprobes.c: only do FOLL_SPLIT_PMD for uprobe register
mm/thp: allow dropping THP from page cache
mm/vmscan.c: support removing arbitrary sized pages from mapping
mm/thp: fix node page state in split_huge_page_to_list()
proc/meminfo: fix output alignment
mm/init-mm.c: include <linux/mman.h> for vm_committed_as_batch
mm/filemap.c: include <linux/ramfs.h> for generic_file_vm_ops definition
mm: include <linux/huge_mm.h> for is_vma_temporary_stack
zram: fix race between backing_dev_show and backing_dev_store
mm/memcontrol: update lruvec counters in mem_cgroup_move_account
ocfs2: fix panic due to ocfs2_wq is null
hugetlbfs: don't access uninitialized memmaps in pfn_range_valid_gigantic()
mm: memblock: do not enforce current limit for memblock_phys* family
mm: memcg: get number of pages on the LRU list in memcgroup base on lru_zone_size
mm/gup: fix a misnamed "write" argument, and a related bug
mm/gup_benchmark: add a missing "w" to getopt string
ocfs2: fix error handling in ocfs2_setattr()
mm: memcg/slab: fix panic in __free_slab() caused by premature memcg pointer release
mm/memunmap: don't access uninitialized memmap in memunmap_pages()
...
Currently lx-symbols assumes that module text is always located at
module->core_layout->base, but s390 uses the following layout:
+------+ <- module->core_layout->base
| GOT |
+------+ <- module->core_layout->base + module->arch->plt_offset
| PLT |
+------+ <- module->core_layout->base + module->arch->plt_offset +
| TEXT | module->arch->plt_size
+------+
Therefore, when trying to debug modules on s390, all the symbol
addresses are skewed by plt_offset + plt_size.
Fix by adding plt_offset + plt_size to module_addr in
load_module_symbols().
Link: http://lkml.kernel.org/r/20191017085917.81791-1-iii@linux.ibm.com
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Attaching uprobe to text section in THP splits the PMD mapped page table
into PTE mapped entries. On uprobe detach, we would like to regroup PMD
mapped page table entry to regain performance benefit of THP.
However, the regroup is broken For perf_event based trace_uprobe. This
is because perf_event based trace_uprobe calls uprobe_unregister twice
on close: first in TRACE_REG_PERF_CLOSE, then in
TRACE_REG_PERF_UNREGISTER. The second call will split the PMD mapped
page table entry, which is not the desired behavior.
Fix this by only use FOLL_SPLIT_PMD for uprobe register case.
Add a WARN() to confirm uprobe unregister never work on huge pages, and
abort the operation when this WARN() triggers.
Link: http://lkml.kernel.org/r/20191017164223.2762148-6-songliubraving@fb.com
Fixes: 5a52c9df62 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT")
Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Once a THP is added to the page cache, it cannot be dropped via
/proc/sys/vm/drop_caches. Fix this issue with proper handling in
invalidate_mapping_pages().
Link: http://lkml.kernel.org/r/20191017164223.2762148-5-songliubraving@fb.com
Fixes: 99cb0dbd47 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Tested-by: Song Liu <songliubraving@fb.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__remove_mapping() assumes that pages can only be either base pages or
HPAGE_PMD_SIZE. Ask the page what size it is.
Link: http://lkml.kernel.org/r/20191017164223.2762148-4-songliubraving@fb.com
Fixes: 99cb0dbd47 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make sure split_huge_page_to_list() handles the state of shmem THP and
file THP properly.
Link: http://lkml.kernel.org/r/20191017164223.2762148-3-songliubraving@fb.com
Fixes: 60fbf0ab5d ("mm,thp: stats for file backed THP")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Tested-by: Song Liu <songliubraving@fb.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "Fixes for THP in page cache", v2.
This patch (of 5):
Add extra space for FileHugePages and FilePmdMapped, so the output is
aligned with other rows.
Link: http://lkml.kernel.org/r/20191017164223.2762148-2-songliubraving@fb.com
Fixes: 60fbf0ab5d ("mm,thp: stats for file backed THP")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Tested-by: Song Liu <songliubraving@fb.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm_init.c needs to include <linux/mman.h> for the definition of
vm_committed_as_batch. Fixes the following sparse warning:
mm/mm_init.c:141:5: warning: symbol 'vm_committed_as_batch' was not declared. Should it be static?
Link: http://lkml.kernel.org/r/20191016091509.26708-1-ben.dooks@codethink.co.uk
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The generic_file_vm_ops is defined in <linux/ramfs.h> so include it to
fix the following warning:
mm/filemap.c:2717:35: warning: symbol 'generic_file_vm_ops' was not declared. Should it be static?
Link: http://lkml.kernel.org/r/20191008102311.25432-1-ben.dooks@codethink.co.uk
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Include <linux/huge_mm.h> for the definition of is_vma_temporary_stack
to fix the following sparse warning:
mm/rmap.c:1673:6: warning: symbol 'is_vma_temporary_stack' was not declared. Should it be static?
Link: http://lkml.kernel.org/r/20191009151155.27763-1-ben.dooks@codethink.co.uk
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reviewed-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CPU0: CPU1:
backing_dev_show backing_dev_store
...... ......
file = zram->backing_dev;
down_read(&zram->init_lock); down_read(&zram->init_init_lock)
file_path(file, ...); zram->backing_dev = backing_dev;
up_read(&zram->init_lock); up_read(&zram->init_lock);
gets the value of zram->backing_dev too early in backing_dev_show, which
resultin the value being NULL at the beginning, and not NULL later.
backtrace:
d_path+0xcc/0x174
file_path+0x10/0x18
backing_dev_show+0x40/0xb4
dev_attr_show+0x20/0x54
sysfs_kf_seq_show+0x9c/0x10c
kernfs_seq_show+0x28/0x30
seq_read+0x184/0x488
kernfs_fop_read+0x5c/0x1a4
__vfs_read+0x44/0x128
vfs_read+0xa0/0x138
SyS_read+0x54/0xb4
Link: http://lkml.kernel.org/r/1571046839-16814-1-git-send-email-chenwandun@huawei.com
Signed-off-by: Chenwandun <chenwandun@huawei.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mapped, dirty and writeback pages are also counted in per-lruvec stats.
These counters needs update when page is moved between cgroups.
Currently is nobody *consuming* the lruvec versions of these counters and
that there is no user-visible effect.
Link: http://lkml.kernel.org/r/157112699975.7360.1062614888388489788.stgit@buzz
Fixes: 00f3ca2c2d ("mm: memcontrol: per-lruvec stats infrastructure")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Uninitialized memmaps contain garbage and in the worst case trigger
kernel BUGs, especially with CONFIG_PAGE_POISONING. They should not get
touched.
Let's make sure that we only consider online memory (managed by the
buddy) that has initialized memmaps. ZONE_DEVICE is not applicable.
page_zone() will call page_to_nid(), which will trigger
VM_BUG_ON_PGFLAGS(PagePoisoned(page), page) with CONFIG_PAGE_POISONING
and CONFIG_DEBUG_VM_PGFLAGS when called on uninitialized memmaps. This
can be the case when an offline memory block (e.g., never onlined) is
spanned by a zone.
Note: As explained by Michal in [1], alloc_contig_range() will verify
the range. So it boils down to the wrong access in this function.
[1] http://lkml.kernel.org/r/20180423000943.GO17484@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20191015120717.4858-1-david@redhat.com
Fixes: f1dd2cd13c ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e86b]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Michal Hocko <mhocko@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: <stable@vger.kernel.org> [4.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Until commit 92d12f9544 ("memblock: refactor internal allocation
functions") the maximal address for memblock allocations was forced to
memblock.current_limit only for the allocation functions returning
virtual address. The changes introduced by that commit moved the limit
enforcement into the allocation core and as a result the allocation
functions returning physical address also started to limit allocations
to memblock.current_limit.
This caused breakage of etnaviv GPU driver:
etnaviv etnaviv: bound 130000.gpu (ops gpu_ops)
etnaviv etnaviv: bound 134000.gpu (ops gpu_ops)
etnaviv etnaviv: bound 2204000.gpu (ops gpu_ops)
etnaviv-gpu 130000.gpu: model: GC2000, revision: 5108
etnaviv-gpu 130000.gpu: command buffer outside valid memory window
etnaviv-gpu 134000.gpu: model: GC320, revision: 5007
etnaviv-gpu 134000.gpu: command buffer outside valid memory window
etnaviv-gpu 2204000.gpu: model: GC355, revision: 1215
etnaviv-gpu 2204000.gpu: Ignoring GPU with VG and FE2.0
Restore the behaviour of memblock_phys* family so that these functions
will not enforce memblock.current_limit.
Link: http://lkml.kernel.org/r/1570915861-17633-1-git-send-email-rppt@kernel.org
Fixes: 92d12f9544 ("memblock: refactor internal allocation functions")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reported-by: Adam Ford <aford173@gmail.com>
Tested-by: Adam Ford <aford173@gmail.com> [imx6q-logicpd]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>