Commit Graph

998733 Commits

Author SHA1 Message Date
Martin KaFai Lau 39cd9e0f67 bpf: selftests: Rename bictcp to bpf_cubic
As with the similar change in the kernel, this patch gives the bpf
cubic its proper name.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015240.1550074-1-kafai@fb.com
2021-03-26 20:41:51 -07:00
Martin KaFai Lau 5bd022ec01 libbpf: Support extern kernel function
This patch makes libbpf able to handle the following extern
kernel function declaration and do the needed relocations before
loading the bpf program into the kernel.

extern int foo(struct sock *) __attribute__((section(".ksyms")));

In the collect extern phase, the needed changes are made to
bpf_object__collect_externs() and find_extern_btf_id() to collect
extern functions in the ".ksyms" section.  The func in the BTF datasec
also needs to be replaced by an int var.  The idea is similar to the
existing handling of extern vars.  Since the BTF may not have a var, a
dummy ksym var is added at the beginning of
bpf_object__collect_externs() if there is a func under the ksyms
datasec.  It also changes the func linkage from extern to global,
which the kernel can support, and assigns a param name if the param
does not have one.

In the collect relo phase, it will record the kernel function
call as RELO_EXTERN_FUNC.

bpf_object__resolve_ksym_func_btf_id() is added to find the func
btf_id of the running kernel.

During actual relocation, it will patch the BPF_CALL instruction with
src_reg = BPF_PSEUDO_KFUNC_CALL and insn->imm set to the running
kernel func's btf_id.
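
The relocation step described above can be sketched in plain C.  This is
a minimal userspace model, not the libbpf code: "struct insn" is a
simplified stand-in for struct bpf_insn, patch_kfunc_call() is a
hypothetical helper name, and the opcode values are copied from the BPF
UAPI header.

```c
#include <stdint.h>

/* Opcode values as found in the BPF UAPI header. */
#define BPF_JMP               0x05
#define BPF_CALL              0x80
#define BPF_PSEUDO_KFUNC_CALL 2

/* Simplified stand-in for struct bpf_insn (same field order). */
struct insn {
	uint8_t code;
	uint8_t dst_reg:4;
	uint8_t src_reg:4;
	int16_t off;
	int32_t imm;
};

/*
 * Hypothetical helper: mark a call instruction as a kernel function
 * call and store the btf_id resolved against the running kernel.
 * Returns 0 on success, -1 if the instruction is not a call.
 */
static int patch_kfunc_call(struct insn *insn, int32_t kernel_btf_id)
{
	if (insn->code != (BPF_JMP | BPF_CALL))
		return -1;

	insn->src_reg = BPF_PSEUDO_KFUNC_CALL;
	insn->imm = kernel_btf_id;
	return 0;
}
```

Non-call instructions are rejected rather than patched, so a caller
walking the instruction array can apply the helper only where a
relocation record points at a call.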

The required LLVM patch: https://reviews.llvm.org/D93563

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015234.1548923-1-kafai@fb.com
2021-03-26 20:41:51 -07:00
Martin KaFai Lau aa0b8d43e9 libbpf: Record extern sym relocation first
This patch records the extern sym relocs first, before recording
subprog relocs.  A later patch will add relocs for extern
kernel function calls, which also use BPF_JMP | BPF_CALL.
Handling the extern symbols first makes that later patch
easier.

is_call_insn() helper is added.  The existing is_ldimm64() helper
is renamed to is_ldimm64_insn() for consistency.
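
For illustration, the two helpers named above can be modelled as simple
opcode comparisons.  This is a sketch under assumptions: "struct insn"
is a simplified stand-in for struct bpf_insn, and the bodies are a
plausible reading of the helper names rather than a copy of the libbpf
source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Opcode values as found in the BPF UAPI headers. */
#define BPF_LD   0x00
#define BPF_JMP  0x05
#define BPF_IMM  0x00
#define BPF_DW   0x18
#define BPF_CALL 0x80

/* Simplified stand-in for struct bpf_insn; only "code" matters here. */
struct insn {
	uint8_t code;
	uint8_t regs;
	int16_t off;
	int32_t imm;
};

/* True for any call instruction: helper, subprog or kernel function. */
static bool is_call_insn(const struct insn *insn)
{
	return insn->code == (BPF_JMP | BPF_CALL);
}

/* True for the first half of a 64-bit immediate load. */
static bool is_ldimm64_insn(const struct insn *insn)
{
	return insn->code == (BPF_LD | BPF_IMM | BPF_DW);
}
```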

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015227.1548623-1-kafai@fb.com
2021-03-26 20:41:51 -07:00
Martin KaFai Lau 0c091e5c2d libbpf: Rename RELO_EXTERN to RELO_EXTERN_VAR
This patch renames RELO_EXTERN to RELO_EXTERN_VAR.
This avoids confusion with a later patch that adds
RELO_EXTERN_FUNC.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015221.1547722-1-kafai@fb.com
2021-03-26 20:41:51 -07:00
Martin KaFai Lau 774e132e83 libbpf: Refactor codes for finding btf id of a kernel symbol
This patch refactors the code that finds the kernel btf_id by kind
and symbol name into a new function, find_ksym_btf_id().

It also adds a new helper __btf_kind_str() to return
a string by the numeric kind value.
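
A kind-to-string helper of this sort is typically a bounds-checked table
lookup.  The sketch below is only an illustration: kind_str() is a
hypothetical name, the numeric values follow the BTF_KIND_* order in the
BTF UAPI header, and the exact strings are assumptions, not the ones
libbpf prints.

```c
#include <stddef.h>

/*
 * Map a numeric BTF kind to a printable name.  Index i holds the
 * name for kind value i; out-of-range kinds fall back to "unknown".
 */
static const char *kind_str(unsigned int kind)
{
	static const char *const names[] = {
		"unknown", "int", "ptr", "array", "struct", "union",
		"enum", "fwd", "typedef", "volatile", "const",
		"restrict", "func", "func_proto", "var", "datasec",
	};

	if (kind >= sizeof(names) / sizeof(names[0]))
		return "unknown";
	return names[kind];
}
```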

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015214.1547069-1-kafai@fb.com
2021-03-26 20:41:51 -07:00
Martin KaFai Lau 933d1aa324 libbpf: Refactor bpf_object__resolve_ksyms_btf_id
This patch refactors most of the logic from
bpf_object__resolve_ksyms_btf_id() into a new function
bpf_object__resolve_ksym_var_btf_id().
This prepares for a later patch adding
bpf_object__resolve_ksym_func_btf_id(), which resolves
a kernel function to the running kernel btf_id.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015207.1546749-1-kafai@fb.com
2021-03-26 20:41:51 -07:00
Martin KaFai Lau e78aea8b21 bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc
This patch puts some tcp cong helper functions, tcp_slow_start()
and tcp_cong_avoid_ai(), into the allowlist for the bpf-tcp-cc
program.

A few tcp cc implementation functions are also put into the
allowlist.  A potential use case is that the bpf-tcp-cc implementation
may only want to override a subset of a tcp_congestion_ops.  For the
others, the bpf-tcp-cc can directly call the kernel counterparts
instead of re-implementing (or copy-and-pasting) them in the bpf
program.

They will only be available to the bpf-tcp-cc typed program.
The allowlist functions are not bound to a fixed ABI contract.
When any of them changes, the bpf-tcp-cc program has to be changed,
just as any in-tree/out-of-tree kernel tcp-cc implementation does.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015201.1546345-1-kafai@fb.com
2021-03-26 20:41:51 -07:00
Martin KaFai Lau d22f6ad187 tcp: Rename bictcp function prefix to cubictcp
The cubic functions in tcp_cubic.c are using the bictcp prefix as
in tcp_bic.c.  This patch gives them the proper name cubictcp
because a later patch will allow the bpf prog to directly
call the cubictcp implementation.  Renaming them will avoid
the name collision when trying to find the intended
one to call during bpf prog load time.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015155.1545532-1-kafai@fb.com
2021-03-26 20:41:51 -07:00
Martin KaFai Lau 797b84f727 bpf: Support kernel function call in x86-32
This patch adds kernel function call support to the x86-32 bpf jit.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015149.1545267-1-kafai@fb.com
2021-03-26 20:41:51 -07:00
Martin KaFai Lau e6ac2450d6 bpf: Support bpf program calling kernel function
This patch adds support to the BPF verifier to allow a bpf program to
call a kernel function directly.

The use case included in this set is to allow bpf-tcp-cc to directly
call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()").  Those
functions have already been used by some kernel tcp-cc implementations.

This set will also allow the bpf-tcp-cc program to directly call the
kernel tcp-cc implementation.  For example, a bpf_dctcp may only want to
implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly
from the kernel tcp_dctcp.c instead of reimplementing (or
copy-and-pasting) them.

The tcp-cc kernel functions mentioned above will be allowlisted
for the struct_ops bpf-tcp-cc programs to use in a later patch.
The allowlisted functions are not bound to a fixed ABI contract.
Those functions have already been used by the existing kernel tcp-cc.
If any of them changes, both in-tree and out-of-tree kernel tcp-cc
implementations have to be changed.  The same goes for the struct_ops
bpf-tcp-cc programs, which have to be adjusted accordingly.

This patch makes the required changes in the bpf verifier.

The first change is in btf.c: it adds a case in
"btf_check_func_arg_match()".  When the passed in
"btf->kernel_btf == true", it means matching the verifier regs' states
with a kernel function.  This will handle the PTR_TO_BTF_ID reg.  It
also maps PTR_TO_SOCK_COMMON, PTR_TO_SOCKET, and PTR_TO_TCP_SOCK to
their kernel btf_ids.

In the later libbpf patch, the insn calling a kernel function will
look like:

insn->code == (BPF_JMP | BPF_CALL)
insn->src_reg == BPF_PSEUDO_KFUNC_CALL /* <- new in this patch */
insn->imm == func_btf_id /* btf_id of the running kernel */

[ For the future calling function-in-kernel-module support, an array
  of module btf_fds can be passed at the load time and insn->off
  can be used to index into this array. ]

At an early stage, the verifier collects all kernel function calls
into "struct bpf_kfunc_desc".  Those descriptors are stored in
"prog->aux->kfunc_tab" and will be available to the JIT.  Since this
"add" operation is similar to the current "add_subprog()" and looks
for the same insn->code, they are done together in the new
"add_subprog_and_kfunc()".
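
One way to picture such a descriptor table is a small array kept sorted
by function id so later stages can look entries up quickly.  The sketch
below is a userspace model under assumptions: the struct layout, the
capacity, and the helper names add_kfunc() and find_kfunc() are
hypothetical, and keeping the array sorted for bsearch() is a design
choice of this example, not something the text above states.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_KFUNC_DESCS 256	/* assumed capacity for this sketch */

/* Hypothetical descriptor: one entry per distinct kernel function. */
struct kfunc_desc {
	uint32_t func_id;	/* btf_id of the kernel function */
	int32_t imm;		/* filled in by a later fixup stage */
};

struct kfunc_tab {
	struct kfunc_desc descs[MAX_KFUNC_DESCS];
	uint32_t nr;
};

static int cmp_func_id(const void *a, const void *b)
{
	const struct kfunc_desc *x = a, *y = b;

	return (x->func_id > y->func_id) - (x->func_id < y->func_id);
}

/* Binary search by func_id; returns NULL if the id was never added. */
static const struct kfunc_desc *find_kfunc(const struct kfunc_tab *tab,
					   uint32_t func_id)
{
	struct kfunc_desc key = { .func_id = func_id };

	return bsearch(&key, tab->descs, tab->nr, sizeof(key), cmp_func_id);
}

/* Add func_id once; duplicates are ignored.  Returns -1 when full. */
static int add_kfunc(struct kfunc_tab *tab, uint32_t func_id)
{
	if (find_kfunc(tab, func_id))
		return 0;
	if (tab->nr == MAX_KFUNC_DESCS)
		return -1;

	tab->descs[tab->nr++] = (struct kfunc_desc){ .func_id = func_id };
	qsort(tab->descs, tab->nr, sizeof(tab->descs[0]), cmp_func_id);
	return 0;
}
```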

In the "do_check()" stage, the new "check_kfunc_call()" is added
to verify the kernel function call instruction:
1. Ensure the kernel function can be used by a particular BPF_PROG_TYPE.
   A new bpf_verifier_ops "check_kfunc_call" is added to do that.
   The bpf-tcp-cc struct_ops program will implement this function in
   a later patch.
2. Call "btf_check_kfunc_args_match()" to ensure the regs can be
   used as the args of a kernel function.
3. Mark the regs' type, subreg_def, and zext_dst.
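
Step 1 above, the per-program-type permission check, can be sketched as
an optional callback that consults an allowlist.  Everything here is a
simplified model: the enum, the btf_id values 101 to 103, and the
function names other than "check_kfunc_call" are hypothetical, and real
ids come from the running kernel's BTF.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum prog_type { PROG_TYPE_OTHER, PROG_TYPE_STRUCT_OPS };

typedef bool (*check_kfunc_call_t)(uint32_t func_id);

/* Hypothetical btf_ids standing in for the allowlisted functions. */
static const uint32_t tcp_cc_allowlist[] = { 101, 102, 103 };

/* Callback a bpf-tcp-cc style program type might provide. */
static bool tcp_cc_check_kfunc_call(uint32_t func_id)
{
	size_t i;

	for (i = 0; i < sizeof(tcp_cc_allowlist) / sizeof(tcp_cc_allowlist[0]); i++)
		if (tcp_cc_allowlist[i] == func_id)
			return true;
	return false;
}

/* Program types without a callback cannot call any kernel function. */
static check_kfunc_call_t ops_for(enum prog_type type)
{
	return type == PROG_TYPE_STRUCT_OPS ? tcp_cc_check_kfunc_call : NULL;
}

static bool kfunc_call_allowed(enum prog_type type, uint32_t func_id)
{
	check_kfunc_call_t check = ops_for(type);

	return check && check(func_id);
}
```

Making the callback optional means the default for every other program
type is "deny", which matches the allowlist approach described above.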

At the later do_misc_fixups() stage, the new fixup_kfunc_call()
will replace the insn->imm with the function address (relative
to __bpf_call_base).  If needed, the jit can find the btf_func_model
by calling the new bpf_jit_find_kfunc_model(prog, insn).
With the imm set to the function address, "bpftool prog dump xlated"
will be able to display the kernel function calls the same way as
it displays other bpf helper calls.
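
The "address relative to a base" encoding can be illustrated with a few
lines of arithmetic.  This is a sketch, not the kernel code: kfunc_imm()
is a hypothetical name, and the explicit range check is an assumption
that follows from insn->imm being a signed 32-bit field.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Compute the 32-bit immediate for a call to func_addr, expressed
 * relative to call_base.  Returns false when the distance does not
 * fit in a signed 32-bit field.
 */
static bool kfunc_imm(uint64_t func_addr, uint64_t call_base, int32_t *imm)
{
	int64_t delta = (int64_t)(func_addr - call_base);

	if (delta < INT32_MIN || delta > INT32_MAX)
		return false;

	*imm = (int32_t)delta;
	return true;
}
```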

A gpl_compatible program is required to call a kernel function.

This feature currently requires JIT.

The verifier selftests are adjusted because of the changes in
the verbose log in add_subprog_and_kfunc().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015142.1544736-1-kafai@fb.com
2021-03-26 20:41:51 -07:00
Martin KaFai Lau 34747c4120 bpf: Refactor btf_check_func_arg_match
This patch moved the subprog specific logic from
btf_check_func_arg_match() to the new btf_check_subprog_arg_match().
The core logic is left in btf_check_func_arg_match() which
will be reused later to check the kernel function call.

The "if (!btf_type_is_ptr(t))" is checked first to improve the
indentation which will be useful for a later patch.

Some of the "btf_kind_str[]" usages are replaced with the shortcut
"btf_type_str(t)".

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015136.1544504-1-kafai@fb.com
2021-03-26 20:41:50 -07:00
Martin KaFai Lau e16301fbe1 bpf: Simplify freeing logic in linfo and jited_linfo
This patch simplifies the linfo freeing logic by combining
"bpf_prog_free_jited_linfo()" and "bpf_prog_free_unused_jited_linfo()"
into the new "bpf_prog_jit_attempt_done()".
This is prep work for the kernel function call support.  In a later
patch, freeing the kernel function call descriptors will also
be done in "bpf_prog_jit_attempt_done()".

"bpf_prog_free_linfo()" is removed since it is only called by
"__bpf_prog_put_noref()".  kvfree() is called directly
instead.

It also takes this chance to s/kcalloc/kvcalloc/ for the jited_linfo
allocation.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015130.1544323-1-kafai@fb.com
2021-03-26 20:41:50 -07:00
Liu Jian a1281601f8 farsync: use DEFINE_SPINLOCK() for spinlock
A spinlock can be initialized automatically with DEFINE_SPINLOCK()
rather than by explicitly calling spin_lock_init().
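
The difference between the two styles is static versus runtime
initialization.  The sketch below is a userspace analogy built on C11
atomic_flag, not the kernel spinlock API: DEFINE_SPINLOCK_SKETCH, the
sketch_*() functions, and the lock name work_q_lock are all
hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct spinlock_sketch {
	atomic_flag locked;
};

/* Define and initialize the lock in one statement, at build time. */
#define DEFINE_SPINLOCK_SKETCH(name) \
	struct spinlock_sketch name = { ATOMIC_FLAG_INIT }

/* No init call is needed anywhere before the first use. */
static DEFINE_SPINLOCK_SKETCH(work_q_lock);

/* Returns true if the lock was free and is now held by the caller. */
static bool sketch_trylock(struct spinlock_sketch *l)
{
	return !atomic_flag_test_and_set_explicit(&l->locked,
						  memory_order_acquire);
}

static void sketch_lock(struct spinlock_sketch *l)
{
	while (!sketch_trylock(l))
		;	/* spin until the holder releases the lock */
}

static void sketch_unlock(struct spinlock_sketch *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

With the static form there is no window in which the lock exists but
has not been initialized yet, and the separate init call in the module
setup path can be dropped.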

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:25:44 -07:00
David S. Miller c3c97fd0ca Merge branch 'llc-kdoc'
Yang Yingliang says:

====================
net: llc: Correct some function names in header

Fix some make W=1 kernel build warnings in net/llc/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:24:14 -07:00
Yang Yingliang 72e6afe6b4 net: llc: Correct function name llc_pdu_set_pf_bit() in header
Fix the following make W=1 kernel build warning:

 net/llc/llc_pdu.c:36: warning: expecting prototype for pdu_set_pf_bit(). Prototype was for llc_pdu_set_pf_bit() instead

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:24:14 -07:00
Yang Yingliang 8114f099d9 net: llc: Correct function name llc_sap_action_unitdata_ind() in header
Fix the following make W=1 kernel build warning:

  net/llc/llc_s_ac.c:38: warning: expecting prototype for llc_sap_action_unit_data_ind(). Prototype was for llc_sap_action_unitdata_ind() instead

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:24:14 -07:00
Yang Yingliang 26440a63a1 net: llc: Correct some function names in header
Fix the following make W=1 kernel build warning:

 net/llc/llc_c_ev.c:622: warning: expecting prototype for conn_ev_qlfy_last_frame_eq_1(). Prototype was for llc_conn_ev_qlfy_last_frame_eq_1() instead
 net/llc/llc_c_ev.c:636: warning: expecting prototype for conn_ev_qlfy_last_frame_eq_0(). Prototype was for llc_conn_ev_qlfy_last_frame_eq_0() instead

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:24:14 -07:00
Hoang Le bc556d3edd tipc: fix kernel-doc warnings
Fix kernel-doc warning introduced in
commit b83e214b2e ("tipc: add extack messages for bearer/media failure"):

net/tipc/bearer.c:248: warning: Function parameter or member 'extack' not described in 'tipc_enable_bearer'

Fixes: b83e214b2e ("tipc: add extack messages for bearer/media failure")
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:22:29 -07:00
Mohammad Athari Bin Ismail 63c173ff7a net: stmmac: Fix kernel panic due to NULL pointer dereference of fpe_cfg
In the patch "net: stmmac: support FPE link partner hand-shaking
procedure", priv->plat->fpe_cfg wouldn't be "devm_kzalloc"ed if
dma_cap->frpsel is 0 (Flexible Rx Parser is not supported in SoC) in
tc_init(). So, fpe_cfg will remain NULL and accessing it will cause a
kernel panic.

To fix this, move the "devm_kzalloc"ing of priv->plat->fpe_cfg before
the dma_cap->frpsel check in tc_init(). Additionally, a check of
priv->dma_cap.fpesel is added before calling stmmac_fpe_link_state_handle()
as only SoCs that support FPE are allowed to call the function.
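
The shape of the fix can be shown with a small userspace model: allocate
first, check the capability afterwards, and guard the later access with
its own capability flag.  The structs, the return codes, and the
*_sketch() function names are simplified stand-ins, not the stmmac
driver code.

```c
#include <stdbool.h>
#include <stdlib.h>

struct fpe_cfg { bool handshake_started; };
struct plat { struct fpe_cfg *fpe_cfg; };
struct dma_cap { bool frpsel; bool fpesel; };

/*
 * Model of the fixed init order: fpe_cfg is allocated before the
 * frpsel check, so an early return no longer leaves it NULL.
 */
static int tc_init_sketch(struct plat *plat, const struct dma_cap *cap)
{
	if (!plat->fpe_cfg) {
		plat->fpe_cfg = calloc(1, sizeof(*plat->fpe_cfg));
		if (!plat->fpe_cfg)
			return -1;	/* allocation failed */
	}

	if (!cap->frpsel)
		return -2;	/* parser unsupported, fpe_cfg still valid */

	return 0;
}

/* Model of the link-up path: touch fpe_cfg only on FPE capable HW. */
static bool link_up_sketch(struct plat *plat, const struct dma_cap *cap)
{
	if (!cap->fpesel)
		return false;

	plat->fpe_cfg->handshake_started = true;
	return true;
}
```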

Below is the kernel panic dump reported by Marek Szyprowski
<m.szyprowski@samsung.com>:

meson8b-dwmac ff3f0000.ethernet eth0: PHY [0.0:00] driver [RTL8211F Gigabit Ethernet] (irq=35)
meson8b-dwmac ff3f0000.ethernet eth0: No Safety Features support found
meson8b-dwmac ff3f0000.ethernet eth0: PTP not supported by HW
meson8b-dwmac ff3f0000.ethernet eth0: configuring for phy/rgmii link mode
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000001
Mem abort info:
...
user pgtable: 4k pages, 48-bit VAs, pgdp=00000000044eb000
[0000000000000001] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Modules linked in: dw_hdmi_i2s_audio dw_hdmi_cec meson_gxl realtek meson_gxbb_wdt snd_soc_meson_axg_sound_card dwmac_generic axg_audio meson_dw_hdmi crct10dif_ce snd_soc_meson_card_utils snd_soc_meson_axg_tdmout panfrost rc_odroid gpu_sched reset_meson_audio_arb meson_ir snd_soc_meson_g12a_tohdmitx snd_soc_meson_axg_frddr sclk_div clk_phase snd_soc_meson_codec_glue dwmac_meson8b snd_soc_meson_axg_fifo stmmac_platform meson_rng meson_drm stmmac rtc_meson_vrtc rng_core meson_canvas pwm_meson dw_hdmi mdio_mux_meson_g12a pcs_xpcs snd_soc_meson_axg_tdm_interface snd_soc_meson_axg_tdm_formatter nvmem_meson_efuse display_connector
CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.12.0-rc4-next-20210325+
Hardware name: Hardkernel ODROID-C4 (DT)
Workqueue: events_power_efficient phylink_resolve
pstate: 20400009 (nzCv daif +PAN -UAO -TCO BTYPE=--)
pc : stmmac_mac_link_up+0x14c/0x348 [stmmac]
lr : stmmac_mac_link_up+0x284/0x348 [stmmac] ...
Call trace:
 stmmac_mac_link_up+0x14c/0x348 [stmmac]
 phylink_resolve+0x104/0x420
 process_one_work+0x2a8/0x718
 worker_thread+0x48/0x460
 kthread+0x134/0x160
 ret_from_fork+0x10/0x18
Code: b971ba60 350007c0 f958c260 f9402000 (39400401)
---[ end trace 0c9deb6c510228aa ]---

Fixes: 5a5586112b ("net: stmmac: support FPE link partner hand-shaking procedure")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:21:51 -07:00
Xu Jia aeab5cfbc8 net: ethernet: remove duplicated include
Remove duplicated include from mtk_ppe_offload.c.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xu Jia <xujia39@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:18:50 -07:00
David S. Miller 4e6d698f86 Merge branch 'axienet-clock-additions'
Robert Hancock says:

====================
axienet clock additions

Add support to the axienet driver for controlling all of the clocks that
the logic core may utilize.

Changed since v3:
-Added Acked-by to patch 1
-Now applies to net-next tree after earlier patches merged in - code
unchanged from v3

Changed since v2:
-Additional clock description clarification

Changed since v1:
-Clarified clock usages in documentation and code comments
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:17:17 -07:00
Robert Hancock b11bfb9a19 net: axienet: Enable more clocks
This driver was only enabling the first clock on the device, regardless
of its name. However, this controller logic can have multiple clocks
which should all be enabled. Add support for enabling additional clocks.
The clock names used match those used in the Xilinx version of this
driver as well as the Xilinx device tree generator, except for mgt_clk
which is not present there.

For backward compatibility, if no named clocks are present, the first
clock present is used for determining the MDIO bus clock divider.
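
The backward-compatible lookup can be sketched as "try the named clock,
else fall back to the first one".  This is a userspace model under
assumptions: "struct clk" here is a toy type, mdio_ref_clk() is a
hypothetical helper, and the clock name "s_axi_lite_clk" is assumed for
illustration rather than taken from the text above.

```c
#include <stddef.h>
#include <string.h>

/* Toy clock entry; a NULL name models an unnamed legacy clock. */
struct clk {
	const char *name;
	unsigned long rate_hz;
};

static const struct clk *find_named(const struct clk *clks, size_t n,
				    const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (clks[i].name && strcmp(clks[i].name, name) == 0)
			return &clks[i];
	return NULL;
}

/*
 * Pick the clock used for the MDIO divider: prefer the named clock,
 * otherwise fall back to the first clock for old device trees.
 */
static const struct clk *mdio_ref_clk(const struct clk *clks, size_t n)
{
	const struct clk *c = find_named(clks, n, "s_axi_lite_clk");

	if (c)
		return c;
	return n ? &clks[0] : NULL;
}
```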

Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:17:17 -07:00
Robert Hancock a0e55dcd2f dt-bindings: net: xilinx_axienet: Document additional clocks
Update DT bindings to describe all of the clocks that the axienet
driver will now be able to make use of.

Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:17:17 -07:00
David S. Miller 32bc7a2cca Merge branch 'mld-sleepable'
Taehee Yoo says:

====================
mld: change context from atomic to sleepable

This patchset changes the context of the MLD module.
Before this patchset, MLD functions ran in atomic context, so they
couldn't use sleepable functions and flags.

There are several reasons why MLD functions are in atomic context.
1. It uses the timer API.
Timer expiration functions are executed in atomic context.
2. Atomic locks
MLD functions use rwlock and spinlock to protect their own resources.

So, in order to switch context, this patchset converts resources to use
RCU and removes the atomic locks and the timer API.

1. The first patch converts from the timer API to delayed work.
The timer API is used for delaying some work.
The MLD protocol has a delay mechanism, which is used for replying to a
query. If a listener receives a query from a router, it should send a
response after some delay. But because the timer expiry function is
executed in atomic context, this patch converts from the timer API to
delayed work.

2. The fourth patch deletes inet6_dev->mc_lock.
The mc_lock has protected the inet6_dev->mc_tomb pointer.
But this pointer is already protected by RTNL and it isn't used by the
datapath. So, it isn't needed and, because of this, many atomic context
critical sections are deleted.

3. The fifth patch converts ip6_sf_socklist to RCU.
ip6_sf_socklist has been protected by ipv6_mc_socklist->sflock(rwlock).
But it is already protected by RTNL. So if it is converted to use RCU
in order to be used in the datapath, the sflock is no longer needed.
So, its control path context can be switched to sleepable.

4. The sixth patch converts ip6_sf_list to RCU.
The reason for this patch is the same as the previous patch.

5. The seventh patch converts ifmcaddr6 to RCU.
The reason for this patch is the same as the previous patch.

6. Add new workqueues for processing query/report events.
With this patch, query and report events are processed by a workqueue,
so the context is sleepable, not atomic.
While doing so, this logic acquires RTNL.

7. Add new mc_lock.
The purpose of this lock is to protect per-interface mld data.
Per-interface mld data is usually used by the query/report event handlers.
So, query/report event workers need only this lock instead of RTNL.
Therefore, it could reduce the bottleneck.

Changelog:
v2 -> v3:
1. Do not use msecs_to_jiffies().
(by Cong Wang)
2. Do not add unnecessary rtnl_lock() and rtnl_unlock().
(by Cong Wang)
3. Fix sparse warnings because of rcu annotation.
(by kernel test robot)
   - Remove some rcu_assign_pointer(), which was used for non-rcu pointer.
   - Add union for rcu pointer.
   - Use rcu API in mld_clear_zeros().
   - Remove remained rcu_read_unlock().
   - Use rcu API for tomb resources.
4. Withdraw previous 2nd and 3rd patches.
   - "separate two flags from ifmcaddr6->mca_flags"
   - "add a new delayed_work, mc_delrec_work"
5. Add 6th and 7th patch.

v1 -> v2:
1. Withdraw unnecessary refactoring patches.
(by Cong Wang, Eric Dumazet, David Ahern)
    a) convert from array to list.
    b) function rename.
2. Split one big patch into several small patches.
3. Do not rename 'ifmcaddr6->mca_lock'.
In the v1 patch, this variable was changed to 'ifmcaddr6->mca_work_lock'.
But this is actually not needed.
4. Do not use atomic_t for 'ifmcaddr6->mca_sfcount' and
'ipv6_mc_socklist->sf_count'.
5. Do not add mld_check_leave_group() function.
6. Do not add ip6_mc_del_src_bulk() function.
7. Do not add ip6_mc_add_src_bulk() function.
8. Do not use rcu_read_lock() in the qeth_l3_add_mcast_rtnl().
(by Julian Wiedmann)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:14:57 -07:00
Taehee Yoo 63ed8de4be mld: add mc_lock for protecting per-interface mld data
The purpose of this lock is to avoid a bottleneck in the query/report
event handler logic.

After the previous patches, almost all mld data is protected by RTNL.
So, the query and report event handler, which is data path logic,
acquires RTNL too. Therefore, if a lot of query and report events
are received, it holds RTNL for a long time.
Using RTNL this way creates a control-plane bottleneck.
In order to avoid this bottleneck, mc_lock is added.

mc_lock protects only per-interface mld data, and per-interface mld
data is used in the query/report event handler logic.
So, rtnl_lock is no longer needed in the query/report event handler logic.
Therefore, mc_lock removes the bottleneck.

Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:14:56 -07:00
Taehee Yoo f185de28d9 mld: add new workqueues for process mld events
When query/report packets are received, the mld module processes them.
But they are processed in BH context, so it couldn't use sleepable
functions. So, in order to switch context, two workqueues are
added which process query and report events.

In struct inet6_dev, mc_{query | report}_queue are added, so they
are per-interface queues.
And mc_{query | report}_work are the workqueue structures.

When a query or report event is received, the skb is queued to the
proper queue and the worker function is scheduled immediately.
Workqueues and queues are protected by a spinlock, which is
mc_{query | report}_lock, and worker functions are protected by RTNL.
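
The queue-then-work pattern can be modelled in userspace as a locked
list that the receive path appends to and a worker that detaches the
whole list before processing it.  This is only a sketch: the evq_*()
names, "struct pkt", and the atomic_flag based lock are hypothetical
stand-ins for the skb queue, spinlock and work item, and detaching the
list in one step is a design choice of this example.

```c
#include <stdatomic.h>
#include <stddef.h>

struct pkt {
	struct pkt *next;
	int id;
};

/* Per-interface event queue guarded by a tiny spinlock. */
struct evq {
	atomic_flag lock;
	struct pkt *head;
	struct pkt **tail;
};

static void evq_init(struct evq *q)
{
	atomic_flag_clear(&q->lock);
	q->head = NULL;
	q->tail = &q->head;
}

static void evq_lock(struct evq *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;	/* spin */
}

static void evq_unlock(struct evq *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Receive path: append under the lock, never sleep. */
static void evq_enqueue(struct evq *q, struct pkt *p)
{
	p->next = NULL;
	evq_lock(q);
	*q->tail = p;
	q->tail = &p->next;
	evq_unlock(q);
}

/*
 * Worker: detach the whole list under the lock, then handle each
 * packet with the lock dropped, so the handler may sleep.
 * Returns the number of packets handled.
 */
static int evq_work(struct evq *q, void (*handle)(struct pkt *))
{
	struct pkt *list;
	int n = 0;

	evq_lock(q);
	list = q->head;
	q->head = NULL;
	q->tail = &q->head;
	evq_unlock(q);

	while (list) {
		struct pkt *next = list->next;

		handle(list);
		list = next;
		n++;
	}
	return n;
}

/* Example handler: remember the id of the last packet seen. */
static int last_id;

static void record_pkt(struct pkt *p)
{
	last_id = p->id;
}
```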

Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:14:56 -07:00
Taehee Yoo 88e2ca3080 mld: convert ifmcaddr6 to RCU
The ifmcaddr6 has been protected by inet6_dev->lock(rwlock) so that
the critical section is atomic context. In order to switch this context,
the locking needs to change. The ifmcaddr6 is actually already protected
by RTNL. So if it's converted to use RCU, its control path context can be
switched to sleepable.

Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:14:56 -07:00
Taehee Yoo 4b200e3989 mld: convert ip6_sf_list to RCU
The ip6_sf_list has been protected by mca_lock(spin_lock) so that the
critical section is atomic context. In order to switch this context,
the locking needs to change. The ip6_sf_list is actually already
protected by RTNL. So if it's converted to use RCU, its control path
context can be switched to sleepable.
But it doesn't remove mca_lock yet because ifmcaddr6 isn't converted
to RCU yet. So, it's not fully converted to the sleepable context.

Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:14:56 -07:00
Taehee Yoo 882ba1f73c mld: convert ipv6_mc_socklist->sflist to RCU
The sflist has been protected by rwlock so that the critical section
is atomic context.
In order to switch this context, the locking needs to change.
The sflist is actually already protected by RTNL. So if it's converted
to use RCU, its control path context can be switched to sleepable.

Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:14:56 -07:00
Taehee Yoo cf2ce339b4 mld: get rid of inet6_dev->mc_lock
The purpose of mc_lock is to protect inet6_dev->mc_tomb.
But mc_tomb is already protected by RTNL, and all functions
which manipulate mc_tomb are called under RTNL.
So, mc_lock is not needed.
Furthermore, it is a spinlock, so the critical section is atomic.
In order to reduce atomic context, it should be removed.

Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:14:55 -07:00
Taehee Yoo 2d9a93b490 mld: convert from timer to delayed work
mcast.c has several timers for delaying work.
A timer's expiry handler runs in atomic context, so it can't use
sleepable things such as GFP_KERNEL, mutex, etc.
In order to use sleepable APIs, this patch converts the timers to
delayed work.
But there are some critical sections which are used by both process
and BH context, so it still uses spin_lock_bh() and rwlock.

Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:14:55 -07:00
David S. Miller 6e27514334 Merge branch 'ethtool-kdoc-touchups'
Jakub Kicinski says:

====================
ethtool: fec: ioctl kdoc touch ups

A few touch ups from v1 review.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:09:45 -07:00
Jakub Kicinski d04feecaf1 ethtool: document the enum values not defines
kdoc does not have good support for documenting defines,
and we can't abuse the enum documentation because it
generates warnings.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:09:45 -07:00
Jakub Kicinski cf2cc0bf4f ethtool: fec: fix FEC_NONE check
Dan points out we need to use the mask not the bit (which is 0).
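
The bug class is easy to reproduce in isolation: testing a flags word
against a bit index of 0 can never be true.  The sketch below defines
local copies of the ethtool FEC constants so that it is self-contained;
the two helper functions are hypothetical and only contrast the two
checks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the ethtool FEC bit indices and masks. */
enum {
	ETHTOOL_FEC_NONE_BIT,	/* bit index 0 */
	ETHTOOL_FEC_AUTO_BIT,
	ETHTOOL_FEC_OFF_BIT,
	ETHTOOL_FEC_RS_BIT,
	ETHTOOL_FEC_BASER_BIT,
};

#define ETHTOOL_FEC_NONE (1u << ETHTOOL_FEC_NONE_BIT)
#define ETHTOOL_FEC_RS   (1u << ETHTOOL_FEC_RS_BIT)

/* Buggy: ANDs with the bit index (0), so the result is always 0. */
static bool has_fec_none_buggy(uint32_t fec)
{
	return (fec & ETHTOOL_FEC_NONE_BIT) != 0;
}

/* Fixed: ANDs with the mask derived from the bit index. */
static bool has_fec_none_fixed(uint32_t fec)
{
	return (fec & ETHTOOL_FEC_NONE) != 0;
}
```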

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 42ce127d98 ("ethtool: fec: sanitize ethtool_fecparam->fec")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:09:45 -07:00
Jakub Kicinski ad1cd7856d ethtool: fec: add note about reuse of reserved
struct ethtool_fecparam::reserved can't be used in SET, because
ethtool user space doesn't zero-initialize the structure.
Make this clear.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:09:45 -07:00
David S. Miller f59798b8f6 Merge branch 'mptcp-cleanups'
Mat Martineau says:

====================
MPTCP: Cleanup and address advertisement fixes

This patch series contains cleanup and fixes we have been testing in the
MPTCP tree. MPTCP uses TCP option headers to advertise additional
address information after an initial connection is established. The main
fixes here deal with making those advertisements more reliable and
improving the way subflows are created after an advertisement is
received.

Patches 1, 2, 4, 10, and 12 are for various cleanup or refactoring.

Patch 3 skips an extra connection attempt if there's already a subflow
connection for the newly received advertisement.

Patches 5, 6, and 7 make sure that the next address is advertised when
there are multiple addresses to share, the advertisement has been
retried, and the peer has not echoed the advertisement. Self tests are
updated.

Patches 8 and 9 fix a problem similar to 5/6/7, but cover a case where
the failure was due to a subflow connection not completing.

Patches 11 and 13 send a bare ack to revoke an advertisement rather than
waiting for other activity to trigger a packet send. This mirrors the
way acks are sent for new advertisements. Self test is included.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:05:15 -07:00
Geliang Tang ef360019db selftests: mptcp: signal addresses testcases
This patch adds testcases for signalling multiple valid and invalid
addresses for both signal_address_tests and remove_tests.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:05:15 -07:00
Geliang Tang b46a023810 mptcp: rename mptcp_pm_nl_add_addr_send_ack
Since mptcp_pm_nl_add_addr_send_ack is now used for both ADD_ADDR and
RM_ADDR cases, rename it to mptcp_pm_nl_addr_send_ack.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:05:15 -07:00
Geliang Tang 8dd5efb1f9 mptcp: send ack for rm_addr
This patch changes the conditions for sending an ACK for ADD_ADDR so
that an ACK packet is sent for RM_ADDR too.

In mptcp_pm_remove_addr, invoke mptcp_pm_nl_add_addr_send_ack to send
the ACK packet.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:05:15 -07:00
Geliang Tang b65d95adb8 mptcp: drop useless addr_signal clear
msk->pm.addr_signal is cleared in mptcp_pm_add_addr_signal, no need to
clear it in mptcp_pm_nl_add_addr_send_ack again. Drop it.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:05:15 -07:00
Geliang Tang 557963c383 mptcp: move to next addr when subflow creation fail
When an invalid address was announced, the subflow couldn't be created
for this address. Therefore mptcp_pm_nl_subflow_established couldn't be
invoked. Then the next addresses in the local address list didn't have a
chance to be announced.

This patch invokes the new function mptcp_pm_add_addr_echoed when the
address is echoed. In it, use mptcp_lookup_anno_list_by_saddr to check
whether this address is in the anno_list. If it is, PM schedules the
status MPTCP_PM_SUBFLOW_ESTABLISHED to invoke
mptcp_pm_create_subflow_or_signal_addr to deal with the next address in
the local address list.
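
The echo handling described above can be sketched in user-space C. This is
a simplified illustration, not the kernel code: the anno_list is modeled as
a plain array, and struct addr, struct pm_state, lookup_anno_by_saddr() and
on_add_addr_echoed() are hypothetical stand-ins for the kernel structures
and for mptcp_lookup_anno_list_by_saddr / mptcp_pm_add_addr_echoed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified address; the kernel type carries more fields. */
struct addr {
	uint32_t ip;	/* IPv4 address, for illustration only */
	uint16_t port;
};

/* Hypothetical path-manager state: the announced addresses plus a flag
 * standing in for scheduling the MPTCP_PM_SUBFLOW_ESTABLISHED status.
 */
struct pm_state {
	const struct addr *anno_list;
	size_t anno_count;
	bool next_addr_scheduled;
};

static bool addr_equal(const struct addr *a, const struct addr *b)
{
	return a->ip == b->ip && a->port == b->port;
}

/* Return the announced entry matching @saddr, or NULL if there is none. */
static const struct addr *lookup_anno_by_saddr(const struct pm_state *pm,
					       const struct addr *saddr)
{
	for (size_t i = 0; i < pm->anno_count; i++)
		if (addr_equal(&pm->anno_list[i], saddr))
			return &pm->anno_list[i];
	return NULL;
}

/* Called when the peer echoes an address. Only an echo of an address we
 * announced ourselves lets the path manager move on to the next one.
 */
static void on_add_addr_echoed(struct pm_state *pm, const struct addr *echoed)
{
	assert(pm && echoed);

	if (lookup_anno_by_saddr(pm, echoed))
		pm->next_addr_scheduled = true;
}
```

An echo for an address that was never announced leaves the state untouched,
so only echoes matching an announced address advance the list.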

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:05:15 -07:00
Geliang Tang d88c476f4a mptcp: export lookup_anno_list_by_saddr
This patch exports the static function lookup_anno_list_by_saddr and
renames it to mptcp_lookup_anno_list_by_saddr.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:05:15 -07:00
Geliang Tang 8da6229b95 selftests: mptcp: timeout testcases for multi addresses
This patch adds timeout testcases for multiple addresses, both valid and
invalid.

These testcases need to transmit 8 ADD_ADDRs, so add a new speed level
'least' that passes 10 to mptcp_connect to slow down transmission.
The original speed level 'slow' still uses 50.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:05:15 -07:00
Geliang Tang 2e580a63b5 selftests: mptcp: add cfg_do_w for cfg_remove
In some testcases, we need to slow down the transmitting process. This
patch adds a new argument named cfg_do_w for cfg_remove to allow the
caller to pass an argument to cfg_remove.

In do_rnd_write, use this cfg_do_w to control the transmitting speed.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:05:15 -07:00
Geliang Tang 348d5c1dec mptcp: move to next addr when timeout
This patch calls mptcp_pm_subflow_established to move to the next address
when an ADD_ADDR has been retransmitted the maximum number of times.
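
A minimal sketch of that give-up logic, assuming a per-address retry
counter. The limit of 3, struct add_entry, enum timer_action and
add_addr_timer_expired() are all assumptions made for illustration; the
exact point at which the kernel timer moves on may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed retry limit for this sketch; the kernel defines its own. */
#define ADD_ADDR_RETRANS_MAX 3

enum timer_action {
	ACTION_RETRANSMIT,	/* send the ADD_ADDR again, re-arm the timer */
	ACTION_NEXT_ADDR,	/* give up on this address, try the next one */
};

/* Hypothetical bookkeeping for one announced address. */
struct add_entry {
	unsigned int retrans_times;	/* retransmissions sent so far */
};

/* Decide what to do when the ADD_ADDR timer fires without an echo. */
static enum timer_action add_addr_timer_expired(struct add_entry *entry)
{
	assert(entry != NULL);

	if (entry->retrans_times < ADD_ADDR_RETRANS_MAX) {
		entry->retrans_times++;
		return ACTION_RETRANSMIT;
	}
	return ACTION_NEXT_ADDR;
}
```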

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:05:15 -07:00
Geliang Tang 62535200be mptcp: drop unused subflow in mptcp_pm_subflow_established
This patch drops the unused parameter subflow in
mptcp_pm_subflow_established().

Fixes: 926bdeab55 ("mptcp: Implement path manager interface commands")
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:05:15 -07:00
Geliang Tang d84ad04941 mptcp: skip connecting the connected address
This patch adds a new helper named lookup_subflow_by_daddr to check
whether the destination address is in the msk's conn_list.

In mptcp_pm_nl_add_addr_received, use lookup_subflow_by_daddr to check
whether the announced address is already connected. If it is, skip
connecting this address and send out the echo.
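
The check can be sketched as a walk over a list of existing subflows. This
is a simplified user-space illustration: struct subflow, the singly linked
list, and add_addr_received() are hypothetical, and the comparison uses the
address only, which may not match what the kernel helper compares.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subflow record; stands in for an entry in conn_list. */
struct subflow {
	uint32_t daddr;		/* destination address, IPv4 for brevity */
	struct subflow *next;
};

enum add_addr_action {
	ACTION_CONNECT,		/* open a new subflow to the address */
	ACTION_ECHO_ONLY,	/* already connected: just send the echo */
};

/* Return true if some subflow in the list already targets @daddr. */
static bool lookup_subflow_by_daddr(const struct subflow *conn_list,
				    uint32_t daddr)
{
	for (const struct subflow *sf = conn_list; sf; sf = sf->next)
		if (sf->daddr == daddr)
			return true;
	return false;
}

/* Decide how to react to an announced address. */
static enum add_addr_action add_addr_received(const struct subflow *conn_list,
					      uint32_t announced)
{
	assert(announced != 0);	/* 0 is not a usable address in this sketch */

	if (lookup_subflow_by_daddr(conn_list, announced))
		return ACTION_ECHO_ONLY;
	return ACTION_CONNECT;
}
```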

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:05:15 -07:00
Geliang Tang f7efc7771e mptcp: drop argument port from mptcp_pm_announce_addr
Drop the redundant argument 'port' from mptcp_pm_announce_addr, and use
the port field of another argument 'addr' instead.

Fixes: 0f5c9e3f07 ("mptcp: add port parameter for mptcp_pm_announce_addr")
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:05:15 -07:00
Paolo Abeni 2d6f5a2b57 mptcp: clean-up the rtx path
After the previous patch we can easily avoid invoking
the workqueue to perform the retransmission, if the
msk socket lock is held at rtx timer expiration.

This also simplifies the relevant code.

Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:05:15 -07:00
David S. Miller 6cb502a368 Merge branch 'ipa-resource'
Alex Elder says:

====================
net: ipa: rework resource programming

This series reworks the way IPA resources are defined and
programmed.  It is a little long--and I apologize for that--but
I think the patches are best taken together as a single unit.

The IPA hardware operates with a set of distinct "resources."  Each
hardware instance has a fixed number of each resource type available.
Available resources are divided into smaller pools, with each pool
shared by endpoints in a "resource group."  Each endpoint is thus
assigned to a resource group that determines which pools supply
resources the IPA hardware uses to handle the endpoint's processing.

The exact set of resources used can differ for each version of IPA.
Except for IPA v3.0 and v3.1, there are 5 source and 2 destination
resource types, but there's no reason to assume this won't change.

The number of resource groups used *does* typically change based on
the hardware version.  For example, some versions target reduced
functionality and support fewer resource groups.

With that as background...

The net result of this series is to improve the flexibility with
which IPA resources and resource groups are defined, permitting each
version of IPA to define its own set of resources and groups.  Along
the way it isolates the resource-related code, and fixes a few bugs
related to resource handling.

The first patch moves resource-related code to a new C file (and
header).  It generates a checkpatch warning about updating
MAINTAINERS, which can be ignored.  The second patch fixes a bug,
but the bug does not affect SDM845 or SC7180.

The third patch defines an enumerated type whose members provide
symbolic names for resource groups.

The fourth defines some resource limits for SDM845 that were not
previously being programmed.  That platform "works" without this,
but to be correct, these limits should really be programmed.

The fifth patch uses a single enumerated type to define both source
and destination resource type IDs, and the sixth uses those IDs to
index the resource limit arrays.  The seventh moves the definition
of that enumerated type into the platform data files, allowing each
platform to define its own set of resource types.
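
As a rough illustration of that idea, one enumerated type can name both
source and destination resource types, with each set restarting at 0 so the
IDs can index per-direction limit arrays directly. Every identifier and
limit value below is hypothetical and chosen for the example; none is taken
from the IPA driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical resource type IDs. Both sets start at 0 so that each can
 * index its own array; duplicate enumerator values are legal in C.
 */
enum rsrc_type {
	RSRC_SRC_PKT_CONTEXTS = 0,
	RSRC_SRC_DESCRIPTOR_BUFF,
	RSRC_SRC_COUNT,		/* number of source types, not a type */

	RSRC_DST_DATA_SECTORS = 0,
	RSRC_DST_DPS_DMARS,
	RSRC_DST_COUNT,		/* number of destination types, not a type */
};

struct rsrc_limits {
	uint32_t min;
	uint32_t max;
};

/* Designated initializers tie each limit to its symbolic type ID. */
static const struct rsrc_limits src_limits[RSRC_SRC_COUNT] = {
	[RSRC_SRC_PKT_CONTEXTS]	   = { .min = 1, .max = 8 },
	[RSRC_SRC_DESCRIPTOR_BUFF] = { .min = 4, .max = 16 },
};

static const struct rsrc_limits dst_limits[RSRC_DST_COUNT] = {
	[RSRC_DST_DATA_SECTORS]	= { .min = 2, .max = 4 },
	[RSRC_DST_DPS_DMARS]	= { .min = 1, .max = 2 },
};

static struct rsrc_limits src_limit(enum rsrc_type type)
{
	assert(type < RSRC_SRC_COUNT);
	return src_limits[type];
}

static struct rsrc_limits dst_limit(enum rsrc_type type)
{
	assert(type < RSRC_DST_COUNT);
	return dst_limits[type];
}
```

Because the arrays are sized by the COUNT members, adding a type to the
enumeration in a platform's data file grows the matching array with it.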

The eighth and ninth are fairly trivial changes.  One replaces two
"max" symbols having the same value with a single symbol.  And the
other replaces two distinct but otherwise identical structure types
with a single common one.

The 10th is a small preparatory patch for the 11th, passing a
different argument to a function that programs resource values.
The 11th allows the actual number of source and destination resource
groups for a platform to be specified in its configuration data.
That way the number is based on the actual number of groups defined.
This removes the need for a sort of clunky pair of functions that
defined that information previously.
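
One way to picture this, as a sketch under assumed names: the group
enumeration ends each set with a COUNT member, and the platform's
configuration data records those counts, so the number always follows the
groups actually defined. The identifiers below are hypothetical; the
maximum of 8 comes from the last patch in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RSRC_GROUP_MAX	8	/* upper bound on groups per direction */

/* Hypothetical resource groups for one platform. */
enum rsrc_group_id {
	GROUP_SRC_LWA_DL = 0,
	GROUP_SRC_UL_DL,
	GROUP_SRC_COUNT,	/* last in set; counts the source groups */

	GROUP_DST_LWA_DL = 0,
	GROUP_DST_UL_DL,
	GROUP_DST_UNUSED,
	GROUP_DST_COUNT,	/* last in set; counts the destination groups */
};

/* Hypothetical slice of per-platform configuration data. */
struct rsrc_config {
	uint32_t group_src_count;
	uint32_t group_dst_count;
};

static const struct rsrc_config platform_config = {
	.group_src_count = GROUP_SRC_COUNT,
	.group_dst_count = GROUP_DST_COUNT,
};

/* Check that the configured counts fit what can be programmed. */
static bool rsrc_config_valid(const struct rsrc_config *cfg)
{
	assert(cfg != NULL);

	return cfg->group_src_count <= RSRC_GROUP_MAX &&
	       cfg->group_dst_count <= RSRC_GROUP_MAX;
}
```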

Finally, the last patch just increases the number of resource groups
that can be defined to 8.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-26 15:02:39 -07:00