Add BPF_XDP_DEVMAP attach type for use with programs associated with a
DEVMAP entry.
Allow DEVMAPs to associate a program with a device entry by adding
a bpf_prog.fd to 'struct bpf_devmap_val'. Values read show the program
id, so the fd and id are a union. bpf programs can get access to the
struct via vmlinux.h.
The program associated with the fd must have type XDP with expected
attach type BPF_XDP_DEVMAP. When a program is associated with a device
index, the program is run on an XDP_REDIRECT and before the buffer is
added to the per-cpu queue. At this point rxq data is still valid; the
next patch adds tx device information allowing the prorgam to see both
ingress and egress device indices.
XDP generic is skb based and XDP programs do not work with skb's. Block
the use case by walking maps used by a program that is to be attached
via xdpgeneric and fail if any of them are DEVMAP / DEVMAP_HASH with
Block attach of BPF_XDP_DEVMAP programs to devices.
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200529220716.75383-3-dsahern@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add "rx_queue_mapping" to bpf_sock. This gives read access for the
existing field (sk_rx_queue_mapping) of struct sock from bpf_sock.
Semantics for the bpf_sock rx_queue_mapping access are similar to
sk_rx_queue_get(), i.e the value NO_QUEUE_MAPPING is not allowed
and -1 is returned in that case. This is useful for transmit queue
selection based on the received queue index which is cached in the
socket in the receive path.
v3: Addressed review comments to add usecase in patch description,
and fixed default value for rx_queue_mapping.
v2: fixed build error for CONFIG_XPS wrapping, reported by
kbuild test robot <lkp@intel.com>
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add these generic helpers that may be useful to use from sk_msg programs.
The helpers do not depend on ctx so we can simply add them here,
BPF_FUNC_perf_event_output
BPF_FUNC_get_current_uid_gid
BPF_FUNC_get_current_pid_tgid
BPF_FUNC_get_current_cgroup_id
BPF_FUNC_get_current_ancestor_cgroup_id
BPF_FUNC_get_cgroup_classid
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/159033903373.12355.15489763099696629346.stgit@john-Precision-5820-Tower
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Compiling with W=1 gives the following warning:
net/sched/cls_flower.c:731:1: warning: ‘mpls_opts_policy’ defined but not used [-Wunused-const-variable=]
The TCA_FLOWER_KEY_MPLS_OPTS contains a list of
TCA_FLOWER_KEY_MPLS_OPTS_LSE. Therefore, the attributes all have the
same type and we can't parse the list with nla_parse*() and have the
attributes validated automatically using an nla_policy.
fl_set_key_mpls_opts() properly verifies that all attributes in the
list are TCA_FLOWER_KEY_MPLS_OPTS_LSE. Then fl_set_key_mpls_lse()
uses nla_parse_nested() on all these attributes, thus verifying that
they have the NLA_F_NESTED flag. So we can safely drop the
mpls_opts_policy.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull crypto updates from Herbert Xu:
"API:
- Introduce crypto_shash_tfm_digest() and use it wherever possible.
- Fix use-after-free and race in crypto_spawn_alg.
- Add support for parallel and batch requests to crypto_engine.
Algorithms:
- Update jitter RNG for SP800-90B compliance.
- Always use jitter RNG as seed in drbg.
Drivers:
- Add Arm CryptoCell driver cctrng.
- Add support for SEV-ES to the PSP driver in ccp"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (114 commits)
crypto: hisilicon - fix driver compatibility issue with different versions of devices
crypto: engine - do not requeue in case of fatal error
crypto: cavium/nitrox - Fix a typo in a comment
crypto: hisilicon/qm - change debugfs file name from qm_regs to regs
crypto: hisilicon/qm - add DebugFS for xQC and xQE dump
crypto: hisilicon/zip - add debugfs for Hisilicon ZIP
crypto: hisilicon/hpre - add debugfs for Hisilicon HPRE
crypto: hisilicon/sec2 - add debugfs for Hisilicon SEC
crypto: hisilicon/qm - add debugfs to the QM state machine
crypto: hisilicon/qm - add debugfs for QM
crypto: stm32/crc32 - protect from concurrent accesses
crypto: stm32/crc32 - don't sleep in runtime pm
crypto: stm32/crc32 - fix multi-instance
crypto: stm32/crc32 - fix run-time self test issue.
crypto: stm32/crc32 - fix ext4 chksum BUG_ON()
crypto: hisilicon/zip - Use temporary sqe when doing work
crypto: hisilicon - add device error report through abnormal irq
crypto: hisilicon - remove codes of directly report device errors through MSI
crypto: hisilicon - QM memory management optimization
crypto: hisilicon - unify initial value assignment into QM
...
A node that has the MRA role, it can behave as MRM or MRC.
Initially it starts as MRM and sends MRP_Test frames on both ring ports.
If it detects that there are MRP_Test send by another MRM, then it
checks if these frames have a lower priority than itself. In this case
it would send MRP_Nack frames to notify the other node that it needs to
stop sending MRP_Test frames.
If it receives a MRP_Nack frame then it stops sending MRP_Test frames
and starts to behave as a MRC but it would continue to monitor the
MRP_Test frames send by MRM. If at a point the MRM stops to send
MRP_Test frames it would get the MRM role and start to send MRP_Test
frames.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each MRP instance has a priority, a lower value means a higher priority.
The priority of MRP instance is stored in MRP_Test frame in this way
all the MRP nodes in the ring can see other nodes priority.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The fl_flow_key structure is around 500 bytes, so having two of them
on the stack in one function now exceeds the warning limit after an
otherwise correct change:
net/sched/cls_flower.c:298:12: error: stack frame size of 1056 bytes in function 'fl_classify' [-Werror,-Wframe-larger-than=]
I suspect the fl_classify function could be reworked to only have one
of them on the stack and modify it in place, but I could not work out
how to do that.
As a somewhat hacky workaround, move one of them into an out-of-line
function to reduce its scope. This does not necessarily reduce the stack
usage of the outer function, but at least the second copy is removed
from the stack during most of it and does not add up to whatever is
called from there.
I now see 552 bytes of stack usage for fl_classify(), plus 528 bytes
for fl_mask_lookup().
Fixes: 58cff782cc ("flow_dissector: Parse multiple MPLS Label Stack Entries")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add packet traps for packets that are sampled / trapped by ACLs, so that
capable drivers could register them with devlink. Add documentation for
every added packet trap and packet trap group.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add layer 3 control packet traps such as ARP and DHCP, so that capable
device drivers could register them with devlink. Add documentation for
every added packet trap and packet trap group.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add layer 2 control packet traps such as STP and IGMP query, so that
capable device drivers could register them with devlink. Add
documentation for every added packet trap and packet trap group.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This type is used for traps that trap control packets such as ARP
request and IGMP query to the CPU.
Do not report such packets to the kernel's drop monitor as they were not
dropped by the device no encountered an exception during forwarding.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The action is used by control traps such as IGMP query. The packet is
flooded by the device, but also trapped to the CPU in order for the
software bridge to mark the receiving port as a multicast router port.
Such packets are marked with 'skb->offload_fwd_mark = 1' in order to
prevent the software bridge from flooding them again.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packets that hit exceptions during layer 3 forwarding must be trapped to
the CPU for the control plane to function properly. Create a dedicated
group for them, so that user space could choose to assign a different
policer for them.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next
to extend ctnetlink and the flowtable infrastructure:
1) Extend ctnetlink kernel side netlink dump filtering capabilities,
from Romain Bellan.
2) Generalise the flowtable hook parser to take a hook list.
3) Pass a hook list to the flowtable hook registration/unregistration.
4) Add a helper function to release the flowtable hook list.
5) Update the flowtable event notifier to pass a flowtable hook list.
6) Allow users to add new devices to an existing flowtables.
7) Allow users to remove devices to an existing flowtables.
8) Allow for registering a flowtable with no initial devices.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Drivers do not register to netdev events to set up indirect blocks
anymore. Remove __flow_indr_block_cb_register() and
__flow_indr_block_cb_unregister().
The frontends set up the callbacks through flow_indr_dev_setup_block()
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update existing frontends to use flow_indr_dev_setup_offload().
This new function must be called if ->ndo_setup_tc is unset to deal
with tunnel devices.
If there is no driver that is subscribed to new tunnel device
flow_block bindings, then this function bails out with EOPNOTSUPP.
If the driver module is removed, the ->cleanup() callback removes the
entries that belong to this tunnel device. This cleanup procedures is
triggered when the device unregisters the tunnel device offload handler.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper function to initialize the flow_block_offload structure.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tunnel devices provide no dev->netdev_ops->ndo_setup_tc(...) interface.
The tunnel device and route control plane does not provide an obvious
way to relate tunnel and physical devices.
This patch allows drivers to register a tunnel device offload handler
for the tc and netfilter frontends through flow_indr_dev_register() and
flow_indr_dev_unregister().
The frontend calls flow_indr_dev_setup_offload() that iterates over the
list of drivers that are offering tunnel device hardware offload
support and it sets up the flow block for this tunnel device.
If the driver module is removed, the indirect flow_block ends up with a
stale callback reference. The module removal path triggers the
dev_shutdown() path to remove the qdisc and the flow_blocks for the
physical devices. However, this is not useful for tunnel devices, where
relation between the physical and the tunnel device is not explicit.
This patch introduces a cleanup callback that is invoked when the driver
module is removed to clean up the tunnel device flow_block. This patch
defines struct flow_block_indr and it uses it from flow_block_cb to
store the information that front-end requires to perform the
flow_block_cb cleanup on module removal.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function schedules the flow teardown state and it forces a gc run.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
trying to configure TC 'act_gate' rules with invalid control actions, the
following splat can be observed:
general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
CPU: 1 PID: 2143 Comm: tc Not tainted 5.7.0-rc6+ #168
Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
RIP: 0010:hrtimer_active+0x56/0x290
[...]
Call Trace:
hrtimer_try_to_cancel+0x6d/0x330
hrtimer_cancel+0x11/0x20
tcf_gate_cleanup+0x15/0x30 [act_gate]
tcf_action_cleanup+0x58/0x170
__tcf_action_put+0xb0/0xe0
__tcf_idr_release+0x68/0x90
tcf_gate_init+0x7c7/0x19a0 [act_gate]
tcf_action_init_1+0x60f/0x960
tcf_action_init+0x157/0x2a0
tcf_action_add+0xd9/0x2f0
tc_ctl_action+0x2a3/0x39d
rtnetlink_rcv_msg+0x5f3/0x920
netlink_rcv_skb+0x121/0x350
netlink_unicast+0x439/0x630
netlink_sendmsg+0x714/0xbf0
sock_sendmsg+0xe2/0x110
____sys_sendmsg+0x5b4/0x890
___sys_sendmsg+0xe9/0x160
__sys_sendmsg+0xd3/0x170
do_syscall_64+0x9a/0x370
entry_SYSCALL_64_after_hwframe+0x44/0xa9
this is caused by hrtimer_cancel(), running before hrtimer_init(). Fix it
ensuring to call hrtimer_cancel() only if clockid is valid, and the timer
has been initialized. After fixing this splat, the same error path causes
another problem:
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 PID: 980 Comm: tc Not tainted 5.7.0-rc6+ #168
Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
RIP: 0010:release_entry_list+0x4a/0x240 [act_gate]
[...]
Call Trace:
tcf_action_cleanup+0x58/0x170
__tcf_action_put+0xb0/0xe0
__tcf_idr_release+0x68/0x90
tcf_gate_init+0x7ab/0x19a0 [act_gate]
tcf_action_init_1+0x60f/0x960
tcf_action_init+0x157/0x2a0
tcf_action_add+0xd9/0x2f0
tc_ctl_action+0x2a3/0x39d
rtnetlink_rcv_msg+0x5f3/0x920
netlink_rcv_skb+0x121/0x350
netlink_unicast+0x439/0x630
netlink_sendmsg+0x714/0xbf0
sock_sendmsg+0xe2/0x110
____sys_sendmsg+0x5b4/0x890
___sys_sendmsg+0xe9/0x160
__sys_sendmsg+0xd3/0x170
do_syscall_64+0x9a/0x370
entry_SYSCALL_64_after_hwframe+0x44/0xa9
the problem is similar: tcf_action_cleanup() was trying to release a list
without initializing it first. Ensure that INIT_LIST_HEAD() is called for
every newly created 'act_gate' action, same as what was done to 'act_ife'
with commit 44c23d7159 ("net/sched: act_ife: initalize ife->metalist
earlier").
Fixes: a51c328df3 ("net: qos: introduce a gate control flow action")
CC: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When neighbor suppression is enabled the bridge device might reply to
Neighbor Solicitation (NS) messages on behalf of remote hosts.
In case the NS message includes the "Source link-layer address" option
[1], the bridge device will use the specified address as the link-layer
destination address in its reply.
To avoid an infinite loop, break out of the options parsing loop when
encountering an option with length zero and disregard the NS message.
This is consistent with the IPv6 ndisc code and RFC 4886 which states
that "Nodes MUST silently discard an ND packet that contains an option
with length zero" [2].
[1] https://tools.ietf.org/html/rfc4861#section-4.3
[2] https://tools.ietf.org/html/rfc4861#section-4.6
Fixes: ed842faeb2 ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alla Segal <allas@mellanox.com>
Tested-by: Alla Segal <allas@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After allocating the spare nexthop group it should be tested for kzalloc()
returning NULL, instead the already used nexthop group (which cannot be
NULL at this point) had been tested so far.
Additionally, if kzalloc() fails, return ERR_PTR(-ENOMEM) instead of NULL.
Coverity-id: 1463885
Reported-by: Coverity <scan-admin@coverity.com>
Signed-off-by: Patrick Eigensatz <patrickeigensatz@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hedberg says:
====================
pull request: bluetooth-next 2020-06-01
Here's one last bluetooth-next pull request for 5.8, which I hope can
still be accepted.
- Enabled Wide-Band Speech (WBS) support for Qualcomm wcn3991
- Multiple fixes/imprvovements to Qualcomm-based devices
- Fix GAP/SEC/SEM/BI-10-C qualfication test case
- Added support for Broadcom BCM4350C5 device
- Several other smaller fixes & improvements
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
xdp_umem.c had overlapping changes between the 64-bit math fix
for the calculation of npgs and the removal of the zerocopy
memory type which got rid of the chunk_size_nohdr member.
The mlx5 Kconfig conflict is a case where we just take the
net-next copy of the Kconfig entry dependency as it takes on
the ESWITCH dependency by one level of indirection which is
what the 'net' conflicting change is trying to ensure.
Signed-off-by: David S. Miller <davem@davemloft.net>
* many 6 GHz changes, though it's not _quite_ complete
(I left out scanning for now, we're still discussing)
* allow userspace SA-query processing for operating channel
validation
* TX status for control port TX, for AP-side operation
* more per-STA/TID control options
* move to kHz for channels, for future S1G operation
* various other small changes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl7TfecACgkQB8qZga/f
l8Tx6hAAgRizfdHb9xxp001AAzKnsdU46srOKOhwV2d6w+S+qHbLtwa0Xz43pBvX
LxpQs7dBQBLYh11xJhDlKY6duYV989xGcsHm7suO43jbjDo8KXfz4MaP65em6EKt
pdD0mD1sKkfR4FhYNbUEe8Ug/185jdk+gX+aI1Nrz6XlkUoiY+czSnGFyAvpvau2
I+NGqyKG5D6ureq7p7dQcgN+t2D4Ou9stVhpQ+jP0Ep720gvfTEzeFuMJbb3JZ1y
KSgOOWS1HQj1FdlJDs3KAmgUXpkU/lxZhNxl06MMYo3tB7Y0vmLoy/ZNcb5eW4Sw
a0SHgG5yhDysCyINz6q7llG3esDcppGiNuMjd/qR2qPOZPHNtlYaHtcoKBcKdS0k
03DyURZpA0B33cr9FTV8tXaM7IMY/2qaq/DqkeNtuDzGdh4jEwkVJ4fNtUAdgcOv
4JEz3A7fY3isy8tzi7Dom4U/2hR1di5gZloAC5PPYRvnbmY9HoIqG06k1Wtn1Yj4
pbquqvdJ5ONcaAaXz7zVQUZm1JzrK81Pl3pdih7USasc8z2MEzWQPSR+hxtwG5TY
KbDI1Nel8ZLbL2MWDakh3+lPoJAMuyadRlVVWEMj4l/afYHgcy5hEbaMbaZnxmAg
G4I6R5JZTJZuVdKi/U/Q9n7jR83qfIRNbxMLY8HFZ4caJ5qhZGs=
=wdaG
-----END PGP SIGNATURE-----
Merge tag 'mac80211-next-for-davem-2020-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
Another set of changes, including
* many 6 GHz changes, though it's not _quite_ complete
(I left out scanning for now, we're still discussing)
* allow userspace SA-query processing for operating channel
validation
* TX status for control port TX, for AP-side operation
* more per-STA/TID control options
* move to kHz for channels, for future S1G operation
* various other small changes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"Another week, another set of bug fixes:
1) Fix pskb_pull length in __xfrm_transport_prep(), from Xin Long.
2) Fix double xfrm_state put in esp{4,6}_gro_receive(), also from Xin
Long.
3) Re-arm discovery timer properly in mac80211 mesh code, from Linus
Lüssing.
4) Prevent buffer overflows in nf_conntrack_pptp debug code, from
Pablo Neira Ayuso.
5) Fix race in ktls code between tls_sw_recvmsg() and
tls_decrypt_done(), from Vinay Kumar Yadav.
6) Fix crashes on TCP fallback in MPTCP code, from Paolo Abeni.
7) More validation is necessary of untrusted GSO packets coming from
virtualization devices, from Willem de Bruijn.
8) Fix endianness of bnxt_en firmware message length accesses, from
Edwin Peer.
9) Fix infinite loop in sch_fq_pie, from Davide Caratti.
10) Fix lockdep splat in DSA by setting lockless TX in netdev features
for slave ports, from Vladimir Oltean.
11) Fix suspend/resume crashes in mlx5, from Mark Bloch.
12) Fix use after free in bpf fmod_ret, from Alexei Starovoitov.
13) ARP retransmit timer guard uses wrong offset, from Hongbin Liu.
14) Fix leak in inetdev_init(), from Yang Yingliang.
15) Don't try to use inet hash and unhash in l2tp code, results in
crashes. From Eric Dumazet"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
l2tp: add sk_family checks to l2tp_validate_socket
l2tp: do not use inet_hash()/inet_unhash()
net: qrtr: Allocate workqueue before kernel_bind
mptcp: remove msk from the token container at destruction time.
mptcp: fix race between MP_JOIN and close
mptcp: fix unblocking connect()
net/sched: act_ct: add nat mangle action only for NAT-conntrack
devinet: fix memleak in inetdev_init()
virtio_vsock: Fix race condition in virtio_transport_recv_pkt
drivers/net/ibmvnic: Update VNIC protocol version reporting
NFC: st21nfca: add missed kfree_skb() in an error path
neigh: fix ARP retransmit timer guard
bpf, selftests: Add a verifier test for assigning 32bit reg states to 64bit ones
bpf, selftests: Verifier bounds tests need to be updated
bpf: Fix a verifier issue when assigning 32bit reg states to 64bit ones
bpf: Fix use-after-free in fmod_ret check
net/mlx5e: replace EINVAL in mlx5e_flower_parse_meta()
net/mlx5e: Fix MLX5_TC_CT dependencies
net/mlx5e: Properly set default values when disabling adaptive moderation
net/mlx5e: Fix arch depending casting issue in FEC
...
The user ID value isn't actually much use - and leaks a kernel pointer or a
userspace value - so replace it with the call debug ID, which appears in trace
points.
Signed-off-by: David Howells <dhowells@redhat.com>
Map the EACCES error that is produced by some ICMP6 packets to EHOSTUNREACH
when we get them as EACCES has other meanings within a filesystem context.
Signed-off-by: David Howells <dhowells@redhat.com>
With some newer AKMs, the KCK and KEK are bigger, so allow that
if the driver advertises support for it. In addition, add a new
attribute for the AKM so we can use it for offloaded rekeying.
Signed-off-by: Nathan Errera <nathan.errera@intel.com>
[reword commit message]
Link: https://lore.kernel.org/r/20200528212237.5eb58b00a5d1.I61b09d77c4f382e8d58a05dcca78096e99a6bc15@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
If a 6 GHz channel exists, then we can probably safely assume that
the device actually supports it, and then it should support most
bandwidths.
This will probably need to be extended to check the interface type
and then dig into the HE capabilities for that though, to have the
correct bandwidth check.
Link: https://lore.kernel.org/r/20200528213443.d4864ef52e92.I82f09b2b14a56413ce20376d09967fe954a033eb@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
An AP supporting EMA (Enhanced Multi-BSSID advertisement) should set
bit 83 in the extended capabilities IE (9.4.2.26 in the 802.11ax D5 spec).
So the *3rd* bit of the 10th byte should be checked.
Also, in one place, the wrong byte was checked.
(cfg80211_find_ie returns a pointer to the beginning of the IE,
so the data really starts at ie[2], so the 10th byte
should be ie[12]. To avoid this confusion, use cfg80211_find_elem
instead).
Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Link: https://lore.kernel.org/r/20200528213443.4316121fa2a3.I9745582f8d41ad8e689dac0fefcd70b276d7c1ea@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
If the AP advertises inconsistent data, namely it has CCFS1 or CCFS2,
but doesn't advertise support for 160/80+80 bandwidth or "Extended NSS
BW Support", then we cannot use any MCSes in the the higher bandwidth.
Thus, avoid connecting with higher bandwidth since it's less efficient
that way.
Link: https://lore.kernel.org/r/20200528213443.0e55d40c3ccc.I6fd0b4708ebd087e5e46466c3e91f6efbcbef668@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
These capabilities cover what would otherwise be transported
in HT/VHT capabilities, but only a subset thereof that is
actually needed on 6 GHz with HE already present. Expose the
capabilities to userspace, drivers are expected to set them
as using the 6 GHz band (currently) requires HE capability.
Link: https://lore.kernel.org/r/20200528213443.244cd5cb9db8.Icd8c773277a88c837e7e3af1d4d1013cc3b66543@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The 6GHz band does not have regulatory approval yet, but things are
moving forward. However, that has led to a change in the channelization
of the 6GHz band which has been accepted in the 11ax specification. It
also fixes a missing MHZ_TO_KHZ() macro for 6GHz channels while at it.
This change is primarily thrown in to discuss how to deal with it.
I noticed ath11k adding 6G support with old channelization and ditto
for iw. It probably involves changes in hostapd as well.
Cc: Pradeep Kumar Chitrapu <pradeepc@codeaurora.org>
Cc: Jouni Malinen <jouni@w1.fi>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Link: https://lore.kernel.org/r/edf07cdd-ad15-4012-3afd-d8b961a80b69@broadcom.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
syzbot was able to trigger a crash after using an ISDN socket
and fool l2tp.
Fix this by making sure the UDP socket is of the proper family.
BUG: KASAN: slab-out-of-bounds in setup_udp_tunnel_sock+0x465/0x540 net/ipv4/udp_tunnel.c:78
Write of size 1 at addr ffff88808ed0c590 by task syz-executor.5/3018
CPU: 0 PID: 3018 Comm: syz-executor.5 Not tainted 5.7.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x188/0x20d lib/dump_stack.c:118
print_address_description.constprop.0.cold+0xd3/0x413 mm/kasan/report.c:382
__kasan_report.cold+0x20/0x38 mm/kasan/report.c:511
kasan_report+0x33/0x50 mm/kasan/common.c:625
setup_udp_tunnel_sock+0x465/0x540 net/ipv4/udp_tunnel.c:78
l2tp_tunnel_register+0xb15/0xdd0 net/l2tp/l2tp_core.c:1523
l2tp_nl_cmd_tunnel_create+0x4b2/0xa60 net/l2tp/l2tp_netlink.c:249
genl_family_rcv_msg_doit net/netlink/genetlink.c:673 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:718 [inline]
genl_rcv_msg+0x627/0xdf0 net/netlink/genetlink.c:735
netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2469
genl_rcv+0x24/0x40 net/netlink/genetlink.c:746
netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329
netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:672
____sys_sendmsg+0x6e6/0x810 net/socket.c:2352
___sys_sendmsg+0x100/0x170 net/socket.c:2406
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2439
do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x45ca29
Code: 0d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007effe76edc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004fe1c0 RCX: 000000000045ca29
RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000005
RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 000000000000094e R14: 00000000004d5d00 R15: 00007effe76ee6d4
Allocated by task 3018:
save_stack+0x1b/0x40 mm/kasan/common.c:49
set_track mm/kasan/common.c:57 [inline]
__kasan_kmalloc mm/kasan/common.c:495 [inline]
__kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:468
__do_kmalloc mm/slab.c:3656 [inline]
__kmalloc+0x161/0x7a0 mm/slab.c:3665
kmalloc include/linux/slab.h:560 [inline]
sk_prot_alloc+0x223/0x2f0 net/core/sock.c:1612
sk_alloc+0x36/0x1100 net/core/sock.c:1666
data_sock_create drivers/isdn/mISDN/socket.c:600 [inline]
mISDN_sock_create+0x272/0x400 drivers/isdn/mISDN/socket.c:796
__sock_create+0x3cb/0x730 net/socket.c:1428
sock_create net/socket.c:1479 [inline]
__sys_socket+0xef/0x200 net/socket.c:1521
__do_sys_socket net/socket.c:1530 [inline]
__se_sys_socket net/socket.c:1528 [inline]
__x64_sys_socket+0x6f/0xb0 net/socket.c:1528
do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
entry_SYSCALL_64_after_hwframe+0x49/0xb3
Freed by task 2484:
save_stack+0x1b/0x40 mm/kasan/common.c:49
set_track mm/kasan/common.c:57 [inline]
kasan_set_free_info mm/kasan/common.c:317 [inline]
__kasan_slab_free+0xf7/0x140 mm/kasan/common.c:456
__cache_free mm/slab.c:3426 [inline]
kfree+0x109/0x2b0 mm/slab.c:3757
kvfree+0x42/0x50 mm/util.c:603
__free_fdtable+0x2d/0x70 fs/file.c:31
put_files_struct fs/file.c:420 [inline]
put_files_struct+0x248/0x2e0 fs/file.c:413
exit_files+0x7e/0xa0 fs/file.c:445
do_exit+0xb04/0x2dd0 kernel/exit.c:791
do_group_exit+0x125/0x340 kernel/exit.c:894
get_signal+0x47b/0x24e0 kernel/signal.c:2739
do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784
exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
entry_SYSCALL_64_after_hwframe+0x49/0xb3
The buggy address belongs to the object at ffff88808ed0c000
which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 1424 bytes inside of
2048-byte region [ffff88808ed0c000, ffff88808ed0c800)
The buggy address belongs to the page:
page:ffffea00023b4300 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0xfffe0000000200(slab)
raw: 00fffe0000000200 ffffea0002838208 ffffea00015ba288 ffff8880aa000e00
raw: 0000000000000000 ffff88808ed0c000 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88808ed0c480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88808ed0c500: 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88808ed0c580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff88808ed0c600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88808ed0c680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Fixes: 6b9f34239b ("l2tp: fix races in tunnel creation")
Fixes: fd558d186d ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Cc: Guillaume Nault <gnault@redhat.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When token lookup on MP_JOIN 3rd ack fails, the server
socket closes with a reset the incoming child. Such socket
has the 'is_mptcp' flag set, but no msk socket associated
- due to the failed lookup.
While crafting the reset packet mptcp_established_options_mp()
will try to dereference the child's master socket, causing
a NULL ptr dereference.
This change addresses the issue with explicit fallback to
TCP in such error path.
Fixes: 729cd6436f ("mptcp: cope better with MP_JOIN failure")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While the other fq-based qdiscs take advantage of skb->hash and doesn't
recompute it if it is already set, sch_cake does not.
This was a deliberate choice because sch_cake hashes various parts of the
packet header to support its advanced flow isolation modes. However,
foregoing the use of skb->hash entirely loses a few important benefits:
- When skb->hash is set by hardware, a few CPU cycles can be saved by not
hashing again in software.
- Tunnel encapsulations will generally preserve the value of skb->hash from
before the encapsulation, which allows flow-based qdiscs to distinguish
between flows even though the outer packet header no longer has flow
information.
It turns out that we can preserve these desirable properties in many cases,
while still supporting the advanced flow isolation properties of sch_cake.
This patch does so by reusing the skb->hash value as the flow_hash part of
the hashing procedure in cake_hash() only in the following conditions:
- If the skb->hash is marked as covering the flow headers (skb->l4_hash is
set)
AND
- NAT header rewriting is either disabled, or did not change any values
used for hashing. The latter is important to match local-origin packets
such as those of a tunnel endpoint.
The immediate motivation for fixing this was the recent patch to WireGuard
to preserve the skb->hash on encapsulation. As such, this is also what I
tested against; with this patch, added latency under load for competing
flows drops from ~8 ms to sub-1ms on an RRUL test over a WireGuard tunnel
going through a virtual link shaped to 1Gbps using sch_cake. This matches
the results we saw with a similar setup using sch_fq_codel when testing the
WireGuard patch.
Fixes: 046f6fd5da ("sched: Add Common Applications Kept Enhanced (cake) qdisc")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A null pointer dereference in qrtr_ns_data_ready() is seen if a client
opens a qrtr socket before qrtr_ns_init() can bind to the control port.
When the control port is bound, the ENETRESET error will be broadcasted
and clients will close their sockets. This results in DEL_CLIENT
packets being sent to the ns and qrtr_ns_data_ready() being called
without the workqueue being allocated.
Allocate the workqueue before setting sk_data_ready and binding to the
control port. This ensures that the work and workqueue structs are
allocated and initialized before qrtr_ns_data_ready can be called.
Fixes: 0c2204a4ad ("net: qrtr: Migrate nameservice to kernel from userspace")
Signed-off-by: Chris Lew <clew@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we remote the msk from the token container only
via mptcp_close(). The MPTCP master socket can be destroyed
also via other paths (e.g. if not yet accepted, when shutting
down the listener socket). When we hit the latter scenario,
dangling msk references are left into the token container,
leading to memory corruption and/or UaF.
This change addresses the issue by moving the token removal
into the msk destructor.
Fixes: 79c0949e9a ("mptcp: Add key generation and token tree")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a MP_JOIN subflow completes the 3whs while another
CPU is closing the master msk, we can hit the
following race:
CPU1 CPU2
close()
mptcp_close
subflow_syn_recv_sock
mptcp_token_get_sock
mptcp_finish_join
inet_sk_state_load
mptcp_token_destroy
inet_sk_state_store(TCP_CLOSE)
__mptcp_flush_join_list()
mptcp_sock_graft
list_add_tail
sk_common_release
sock_orphan()
<socket free>
The MP_JOIN socket will be leaked. Additionally we can hit
UaF for the msk 'struct socket' referenced via the 'conn'
field.
This change try to address the issue introducing some
synchronization between the MP_JOIN 3whs and mptcp_close
via the join_list spinlock. If we detect the msk is closing
the MP_JOIN socket is closed, too.
Fixes: f296234c98 ("mptcp: Add handling of incoming MP_JOIN requests")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently unblocking connect() on MPTCP sockets fails frequently.
If mptcp_stream_connect() is invoked to complete a previously
attempted unblocking connection, it will still try to create
the first subflow via __mptcp_socket_create(). If the 3whs is
completed and the 'can_ack' flag is already set, the latter
will fail with -EINVAL.
This change addresses the issue checking for pending connect and
delegating the completion to the first subflow. Additionally
do msk addresses and sk_state changes only when needed.
Fixes: 2303f994b3 ("mptcp: Associate MPTCP context with TCP socket")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pre-fetch send buffer for the CDC validation message before entering the
send_lock. Without that the send call might fail with -EBUSY because
there are no free buffers and waiting for buffers is not possible under
send_lock.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently add nat mangle action with comparing invert and orig tuple.
It is better to check IPS_NAT_MASK flags first to avoid non necessary
memcmp for non-NAT conntrack.
Signed-off-by: wenxu <wenxu@ucloud.cn>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Accumulated cleanup patches and sparse warning fixes for mlx5 driver.
1) sync with mlx5-next branch
2) Eli Cohen declares mpls_entry_encode() helper in mpls.h as suggested
by Jakub Kicinski and David Ahern, and use it in mlx5
3) Jesper Fixes xdp data_meta setup in mlx5
4) Many sparse and build warnings cleanup
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl7R3wcACgkQSD+KveBX
+j6/ZQf/QD39naPeImfLjemkRK9L+TKbS4nU6wpUwf1jC33Wdm4HhkhsWEnR6C4l
OwU/Pae3I9EtKP4gRE0W1o8h7zC9h4hY7+IKZOdyQ32iUY55PX/H25oqAiCj1NCM
xzWpXOTwK/vkqmkCedAd+YpNdYlbOhfycr+KVPSsvFdaPqjzfNO1PJcLsUbAbzrX
A+8pYdhUYTtx1N3YHJL5abLN6WzMAKxgwlm9GG8YCXACTJT6CBWWHGebVsC5TDUk
Lj5hJj38mI8/3dcu6vWP0kLGVfRZo0HS/gpPGxbKQFpP+1uBYaRENAQONxkY++6S
GDPix7ccvN+yNMlON893PC/Cogw3Yg==
=WaCJ
-----END PGP SIGNATURE-----
Merge tag 'mlx5-cleanup-2020-05-29' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-cleanup-2020-05-29
Accumulated cleanup patches and sparse warning fixes for mlx5 driver.
1) sync with mlx5-next branch
2) Eli Cohen declares mpls_entry_encode() helper in mpls.h as suggested
by Jakub Kicinski and David Ahern, and use it in mlx5
3) Jesper Fixes xdp data_meta setup in mlx5
4) Many sparse and build warnings cleanup
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When devinet_sysctl_register() failed, the memory allocated
in neigh_parms_alloc() should be freed.
Fixes: 20e61da7ff ("ipv4: fail early when creating netdev named all or default")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When client on the host tries to connect(SOCK_STREAM, O_NONBLOCK) to the
server on the guest, there will be a panic on a ThunderX2 (armv8a server):
[ 463.718844] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 463.718848] Mem abort info:
[ 463.718849] ESR = 0x96000044
[ 463.718852] EC = 0x25: DABT (current EL), IL = 32 bits
[ 463.718853] SET = 0, FnV = 0
[ 463.718854] EA = 0, S1PTW = 0
[ 463.718855] Data abort info:
[ 463.718856] ISV = 0, ISS = 0x00000044
[ 463.718857] CM = 0, WnR = 1
[ 463.718859] user pgtable: 4k pages, 48-bit VAs, pgdp=0000008f6f6e9000
[ 463.718861] [0000000000000000] pgd=0000000000000000
[ 463.718866] Internal error: Oops: 96000044 [#1] SMP
[...]
[ 463.718977] CPU: 213 PID: 5040 Comm: vhost-5032 Tainted: G O 5.7.0-rc7+ #139
[ 463.718980] Hardware name: GIGABYTE R281-T91-00/MT91-FS1-00, BIOS F06 09/25/2018
[ 463.718982] pstate: 60400009 (nZCv daif +PAN -UAO)
[ 463.718995] pc : virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
[ 463.718999] lr : virtio_transport_recv_pkt+0x1fc/0xd40 [vmw_vsock_virtio_transport_common]
[ 463.719000] sp : ffff80002dbe3c40
[...]
[ 463.719025] Call trace:
[ 463.719030] virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
[ 463.719034] vhost_vsock_handle_tx_kick+0x360/0x408 [vhost_vsock]
[ 463.719041] vhost_worker+0x100/0x1a0 [vhost]
[ 463.719048] kthread+0x128/0x130
[ 463.719052] ret_from_fork+0x10/0x18
The race condition is as follows:
Task1 Task2
===== =====
__sock_release virtio_transport_recv_pkt
__vsock_release vsock_find_bound_socket (found sk)
lock_sock_nested
vsock_remove_sock
sock_orphan
sk_set_socket(sk, NULL)
sk->sk_shutdown = SHUTDOWN_MASK
...
release_sock
lock_sock
virtio_transport_recv_connecting
sk->sk_socket->state (panic!)
The root cause is that vsock_find_bound_socket can't hold the lock_sock,
so there is a small race window between vsock_find_bound_socket() and
lock_sock(). If __vsock_release() is running in another task,
sk->sk_socket will be set to NULL inadvertently.
This fixes it by checking sk->sk_shutdown(suggested by Stefano) after
lock_sock since sk->sk_shutdown is set to SHUTDOWN_MASK under the
protection of lock_sock_nested.
Signed-off-by: Jia He <justin.he@arm.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move mpls_entry_encode() from net/mpls/internal.h to include/net/mpls.h
and make it available for other users. Specifically, hardware driver that
offload MPLS can benefit from that.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
As of commit 98fa6271cf
("tcp: refactor setting the initial congestion window") this is called
only from tcp_input.c, so it can be static.
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
kbuild test robot is reporting:
net/ethtool/cabletest.c:230:5: warning: no previous prototype for
Mark the function as static.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
net/tipc/msg.c: In function 'tipc_msg_append':
net/tipc/msg.c:215:24: warning:
variable 'prev' set but not used [-Wunused-but-set-variable]
commit 0a3e060f34 ("tipc: add test for Nagle algorithm effectiveness")
left behind this, remove it.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 19e16d220f ("neigh: support smaller retrans_time settting")
we add more accurate control for ARP and NS. But for ARP I forgot to
update the latest guard in neigh_timer_handler(), then the next
retransmit would be reset to jiffies + HZ/2 if we set the retrans_time
less than 500ms. Fix it by setting the time_before() check to HZ/100.
IPv6 does not have this issue.
Reported-by: Jianwen Ji <jiji@redhat.com>
Fixes: 19e16d220f ("neigh: support smaller retrans_time settting")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Right now, our only tag_8021q user, sja1105, has the ability to restore
bridge VLANs on its own, so this logic is unnecessary.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf 2020-05-29
The following pull-request contains BPF updates for your *net* tree.
We've added 6 non-merge commits during the last 7 day(s) which contain
a total of 4 files changed, 55 insertions(+), 34 deletions(-).
The main changes are:
1) minor verifier fix for fmod_ret progs, from Alexei.
2) af_xdp overflow check, from Bjorn.
3) minor verifier fix for 32bit assignment, from John.
4) powerpc has non-overlapping addr space, from Petr.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
No users left.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SCTP protocol allows to bind multiple address to a socket. That
feature is currently only exposed as a socket option. Add a bind_add
method struct proto that allows to bind additional addresses, and
switch the dlm code to use the method instead of going through the
socket option from kernel space.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Split out a sctp_setsockopt_bindx_kernel that takes a kernel pointer
to the sockaddr and make sctp_setsockopt_bindx a small wrapper around
it. This prepares for adding a new bind_add proto op.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2020-05-29
1) Several fixes for ESP gro/gso in transport and beet mode when
IPv6 extension headers are present. From Xin Long.
2) Fix a wrong comment on XFRMA_OFFLOAD_DEV.
From Antony Antony.
3) Fix sk_destruct callback handling on ESP in TCP encapsulation.
From Sabrina Dubroca.
4) Fix a use after free in xfrm_output_gso when used with vxlan.
From Xin Long.
5) Fix secpath handling of VTI when used wiuth IPCOMP.
From Xin Long.
6) Fix an oops when deleting a x-netns xfrm interface.
From Nicolas Dichtel.
7) Fix a possible warning on policy updates. We had a case where it was
possible to add two policies with the same lookup keys.
From Xin Long.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2020-05-29
1) Add IPv6 encapsulation support for ESP over UDP and TCP.
From Sabrina Dubroca.
2) Remove unneeded reference when initializing xfrm interfaces.
From Nicolas Dichtel.
3) Remove some indirect calls from the state_afinfo.
From Florian Westphal.
Please note that this pull request has two merge conflicts
between commit:
0c922a4850 ("xfrm: Always set XFRM_TRANSFORMED in xfrm{4,6}_output_finish")
from Linus' tree and commit:
2ab6096db2 ("xfrm: remove output_finish indirection from xfrm_state_afinfo")
from the ipsec-next tree.
and between commit:
3986912f6a ("ipv6: move SIOCADDRT and SIOCDELRT handling into ->compat_ioctl")
from the net-next tree and commit:
0146dca70b ("xfrm: add support for UDPv6 encapsulation of ESP")
from the ipsec-next tree.
Both conflicts can be resolved as done in linux-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
There was a lockdep which led to commit
fad003b6c8 ("Bluetooth: Fix inconsistent lock state with RFCOMM")
Lockdep noticed that `sk->sk_lock.slock' was acquired without disabling
the softirq while the lock was also used in softirq context.
Unfortunately the solution back then was to disable interrupts before
acquiring the lock which however made lockdep happy.
It would have been enough to simply disable the softirq. Disabling
interrupts before acquiring a spinlock_t is not allowed on PREEMPT_RT
because these locks are converted to 'sleeping' spinlocks.
Use spin_lock_bh() in order to acquire the `sk_lock.slock'.
Reported-by: Luis Claudio R. Goncalves <lclaudio@uudg.org>
Reported-by: kbuild test robot <lkp@intel.com> [missing unlock]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch is to fix a crash:
[ ] kasan: GPF could be caused by NULL-ptr deref or user memory access
[ ] general protection fault: 0000 [#1] SMP KASAN PTI
[ ] RIP: 0010:ipv6_local_error+0xac/0x7a0
[ ] Call Trace:
[ ] xfrm6_local_error+0x1eb/0x300
[ ] xfrm_local_error+0x95/0x130
[ ] __xfrm6_output+0x65f/0xb50
[ ] xfrm6_output+0x106/0x46f
[ ] udp_tunnel6_xmit_skb+0x618/0xbf0 [ip6_udp_tunnel]
[ ] vxlan_xmit_one+0xbc6/0x2c60 [vxlan]
[ ] vxlan_xmit+0x6a0/0x4276 [vxlan]
[ ] dev_hard_start_xmit+0x165/0x820
[ ] __dev_queue_xmit+0x1ff0/0x2b90
[ ] ip_finish_output2+0xd3e/0x1480
[ ] ip_do_fragment+0x182d/0x2210
[ ] ip_output+0x1d0/0x510
[ ] ip_send_skb+0x37/0xa0
[ ] raw_sendmsg+0x1b4c/0x2b80
[ ] sock_sendmsg+0xc0/0x110
This occurred when sending a v4 skb over vxlan6 over ipsec, in which case
skb->protocol == htons(ETH_P_IPV6) while skb->sk->sk_family == AF_INET in
xfrm_local_error(). Then it will go to xfrm6_local_error() where it tries
to get ipv6 info from a ipv4 sk.
This issue was actually fixed by Commit 628e341f31 ("xfrm: make local
error reporting more robust"), but brought back by Commit 844d48746e
("xfrm: choose protocol family by skb protocol").
So to fix it, we should call xfrm6_local_error() only when skb->protocol
is htons(ETH_P_IPV6) and skb->sk->sk_family is AF_INET6.
Fixes: 844d48746e ("xfrm: choose protocol family by skb protocol")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Make sure SCTP_ADDR_{MADE_PRIM,ADDED} are sent only for associations
that have been established.
These events are described in rfc6458#section-6.1
SCTP_PEER_ADDR_CHANGE:
This tag indicates that an address that is
part of an existing association has experienced a change of
state (e.g., a failure or return to service of the reachability
of an endpoint via a specific transport address).
Signed-off-by: Jonas Falkevik <jonas.falkevik@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid using kernel_setsockopt for the TIPC_IMPORTANCE option when we can
just use the internal helper. The only change needed is to pass a struct
sock instead of tipc_sock, which is private to socket.c
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the RXRPC_MIN_SECURITY_LEVEL sockopt from
kernel space without going through a fake uaccess.
Thanks to David Howells for the documentation updates.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the IPV6_RECVPKTINFO sockopt from kernel
space without going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the IPV6_ADD_PREFERENCES sockopt from kernel
space without going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the IPV6_RECVERR sockopt from kernel space
without going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the IPV6_V6ONLY sockopt from kernel space
without going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the IP_PKTINFO sockopt from kernel
space without going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the IP_MTU_DISCOVER sockopt from kernel
space without going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Howells <dhowells@redhat.com> [rxrpc bits]
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the IP_RECVERR sockopt from kernel space
without going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the IP_FREEBIND sockopt from kernel space
without going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the IP_TOS sockopt from kernel space without
going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the TCP_KEEPCNT sockopt from kernel space
without going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the TCP_KEEPINTVL sockopt from kernel space
without going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the TCP_KEEP_IDLE sockopt from kernel
space without going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the TCP_USER_TIMEOUT sockopt from kernel
space without going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the TCP_SYNCNT sockopt from kernel space
without going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the TCP_QUICKACK sockopt from kernel space
without going through a fake uaccess. Cleanup the callers to avoid
pointless wrappers now that this is a simple function call.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the TCP_NODELAY sockopt from kernel space
without going through a fake uaccess. Cleanup the callers to avoid
pointless wrappers now that this is a simple function call.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the TCP_CORK sockopt from kernel space
without going through a fake uaccess. Cleanup the callers to avoid
pointless wrappers now that this is a simple function call.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to directly set the SO_REUSEPORT sockopt from kernel space
without going through a fake uaccess.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>