linux

Commit Graph

Author	SHA1	Message	Date
Roman Gushchin	25025e0aab	bpf: sync include/uapi/linux/bpf.h to tools/include/uapi/linux/bpf.h The sync is required due to the appearance of a new map type: BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE, which implements per-cpu cgroup local storage. Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-01 16:18:33 +02:00
Roman Gushchin	c6fdcd6e0c	bpf: don't allow create maps of per-cpu cgroup local storages Explicitly forbid creating map of per-cpu cgroup local storages. This behavior matches the behavior of shared cgroup storages. Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-01 16:18:33 +02:00
Roman Gushchin	b741f16303	bpf: introduce per-cpu cgroup local storage This commit introduced per-cpu cgroup local storage. Per-cpu cgroup local storage is very similar to simple cgroup storage (let's call it shared), except all the data is per-cpu. The main goal of per-cpu variant is to implement super fast counters (e.g. packet counters), which don't require neither lookups, neither atomic operations. >From userspace's point of view, accessing a per-cpu cgroup storage is similar to other per-cpu map types (e.g. per-cpu hashmaps and arrays). Writing to a per-cpu cgroup storage is not atomic, but is performed by copying longs, so some minimal atomicity is here, exactly as with other per-cpu maps. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-01 16:18:32 +02:00
Roman Gushchin	f294b37ec7	bpf: rework cgroup storage pointer passing To simplify the following introduction of per-cpu cgroup storage, let's rework a bit a mechanism of passing a pointer to a cgroup storage into the bpf_get_local_storage(). Let's save a pointer to the corresponding bpf_cgroup_storage structure, instead of a pointer to the actual buffer. It will help us to handle per-cpu storage later, which has a different way of accessing to the actual data. Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-01 16:18:32 +02:00
Roman Gushchin	8bad74f984	bpf: extend cgroup bpf core to allow multiple cgroup storage types In order to introduce per-cpu cgroup storage, let's generalize bpf cgroup core to support multiple cgroup storage types. Potentially, per-node cgroup storage can be added later. This commit is mostly a formal change that replaces cgroup_storage pointer with a array of cgroup_storage pointers. It doesn't actually introduce a new storage type, it will be done later. Each bpf program is now able to have one cgroup storage of each type. Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-10-01 16:18:32 +02:00
Yonghong Song	5bf7a60b8e	bpf: permit CGROUP_DEVICE programs accessing helper bpf_get_current_cgroup_id() Currently, helper bpf_get_current_cgroup_id() is not permitted for CGROUP_DEVICE type of programs. If the helper is used in such cases, the verifier will log the following error: 0: (bf) r6 = r1 1: (69) r7 = (u16 )(r6 +0) 2: (85) call bpf_get_current_cgroup_id#80 unknown func bpf_get_current_cgroup_id#80 The bpf_get_current_cgroup_id() is useful for CGROUP_DEVICE type of programs in order to customize action based on cgroup id. This patch added such a support. Cc: Roman Gushchin <guro@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Roman Gushchin <guro@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-28 14:15:19 +02:00
Daniel Borkmann	78e6e5c11a	Merge branch 'bpf-libbpf-attach-by-name' Andrey Ignatov says: ==================== This patch set introduces libbpf_attach_type_by_name function in libbpf to identify attach type by section name. This is useful to avoid writing same logic over and over again in user space applications that leverage libbpf. Patch 1 has more details on the new function and problem being solved. Patches 2 and 3 add support for new section names. Patch 4 uses new function in a selftest. Patch 5 adds selftest for libbpf_{prog,attach}_type_by_name. As a side note there are a lot of inconsistencies now between names used by libbpf and bpftool (e.g. cgroup/skb vs cgroup_skb, cgroup_device and device vs cgroup/dev, sockops vs sock_ops, etc). This patch set does not address it but it tries not to make it harder to address it in the future. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-27 21:19:34 +02:00
Andrey Ignatov	370920c47b	selftests/bpf: Test libbpf_{prog,attach}_type_by_name Add selftest for libbpf functions libbpf_prog_type_by_name and libbpf_attach_type_by_name. Example of output: % ./tools/testing/selftests/bpf/test_section_names Summary: 35 PASSED, 0 FAILED Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-27 21:14:59 +02:00
Andrey Ignatov	c9bf507d0a	selftests/bpf: Use libbpf_attach_type_by_name in test_socket_cookie Use newly introduced libbpf_attach_type_by_name in test_socket_cookie selftest. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-27 21:14:59 +02:00
Andrey Ignatov	c6f6851b28	libbpf: Support sk_skb/stream_{parser, verdict} section names Add section names for BPF_SK_SKB_STREAM_PARSER and BPF_SK_SKB_STREAM_VERDICT attach types to be able to identify them in libbpf_attach_type_by_name. "stream_parser" and "stream_verdict" are used instead of simple "parser" and "verdict" just to avoid possible confusion in a place where attach type is used alone (e.g. in bpftool's show sub-commands) since there is another attach point that can be named as "verdict": BPF_SK_MSG_VERDICT. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-27 21:14:59 +02:00
Andrey Ignatov	bafa7afe63	libbpf: Support cgroup_skb/{e,in}gress section names Add section names for BPF_CGROUP_INET_INGRESS and BPF_CGROUP_INET_EGRESS attach types to be able to identify them in libbpf_attach_type_by_name. "cgroup_skb" is used instead of "cgroup/skb" mostly to easy possible unifying of how libbpf and bpftool works with section names: * bpftool uses "cgroup_skb" to in "prog list" sub-command; * bpftool uses "ingress" and "egress" in "cgroup list" sub-command; * having two parts instead of three in a string like "cgroup_skb/ingress" can be leveraged to split it to prog_type part and attach_type part, or vise versa: use two parts to make a section name. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-27 21:14:59 +02:00
Andrey Ignatov	956b620fcf	libbpf: Introduce libbpf_attach_type_by_name There is a common use-case when ELF object contains multiple BPF programs and every program has its own section name. If it's cgroup-bpf then programs have to be 1) loaded and 2) attached to a cgroup. It's convenient to have information necessary to load BPF program together with program itself. This is where section name works fine in conjunction with libbpf_prog_type_by_name that identifies prog_type and expected_attach_type and these can be used with BPF_PROG_LOAD. But there is currently no way to identify attach_type by section name and it leads to messy code in user space that reinvents guessing logic every time it has to identify attach type to use with BPF_PROG_ATTACH. The patch introduces libbpf_attach_type_by_name that guesses attach type by section name if a program can be attached. The difference between expected_attach_type provided by libbpf_prog_type_by_name and attach_type provided by libbpf_attach_type_by_name is the former is used at BPF_PROG_LOAD time and can be zero if a program of prog_type X has only one corresponding attach type Y whether the latter provides specific attach type to use with BPF_PROG_ATTACH. No new section names were added to section_names array. Only existing ones were reorganized and attach_type was added where appropriate. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-27 21:14:59 +02:00
Song Liu	100811936f	bpf: test_bpf: add init_net to dev for flow_dissector Latest changes in __skb_flow_dissect() assume skb->dev has valid nd_net. However, this is not true for test_bpf. As a result, test_bpf.ko crashes the system with the following stack trace: [ 1133.716622] BUG: unable to handle kernel paging request at 0000000000001030 [ 1133.716623] PGD 8000001fbf7ee067 [ 1133.716624] P4D 8000001fbf7ee067 [ 1133.716624] PUD 1f6c1cf067 [ 1133.716625] PMD 0 [ 1133.716628] Oops: 0000 [#1] SMP PTI [ 1133.716630] CPU: 7 PID: 40473 Comm: modprobe Kdump: loaded Not tainted 4.19.0-rc5-00805-gca11cc92ccd2 #1167 [ 1133.716631] Hardware name: Wiwynn Leopard-Orv2/Leopard-DDR BW, BIOS LBM12.5 12/06/2017 [ 1133.716638] RIP: 0010:__skb_flow_dissect+0x83/0x1680 [ 1133.716639] Code: 04 00 00 41 0f b7 44 24 04 48 85 db 4d 8d 14 07 0f 84 01 02 00 00 48 8b 43 10 48 85 c0 0f 84 e5 01 00 00 48 8b 80 a8 04 00 00 <48> 8b 90 30 10 00 00 48 85 d2 0f 84 dd 01 00 00 31 c0 b9 05 00 00 [ 1133.716640] RSP: 0018:ffffc900303c7a80 EFLAGS: 00010282 [ 1133.716642] RAX: 0000000000000000 RBX: ffff881fea0b7400 RCX: 0000000000000000 [ 1133.716643] RDX: ffffc900303c7bb4 RSI: ffffffff8235c3e0 RDI: ffff881fea0b7400 [ 1133.716643] RBP: ffffc900303c7b80 R08: 0000000000000000 R09: 000000000000000e [ 1133.716644] R10: ffffc900303c7bb4 R11: ffff881fb6840400 R12: ffffffff8235c3e0 [ 1133.716645] R13: 0000000000000008 R14: 000000000000001e R15: ffffc900303c7bb4 [ 1133.716646] FS: 00007f54e75d3740(0000) GS:ffff881fff5c0000(0000) knlGS:0000000000000000 [ 1133.716648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1133.716649] CR2: 0000000000001030 CR3: 0000001f6c226005 CR4: 00000000003606e0 [ 1133.716649] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1133.716650] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1133.716651] Call Trace: [ 1133.716660] ? sched_clock_cpu+0xc/0xa0 [ 1133.716662] ? sched_clock_cpu+0xc/0xa0 [ 1133.716665] ? log_store+0x1b5/0x260 [ 1133.716667] ? up+0x12/0x60 [ 1133.716669] ? skb_get_poff+0x4b/0xa0 [ 1133.716674] ? __kmalloc_reserve.isra.47+0x2e/0x80 [ 1133.716675] skb_get_poff+0x4b/0xa0 [ 1133.716680] bpf_skb_get_pay_offset+0xa/0x10 [ 1133.716686] ? test_bpf_init+0x578/0x1000 [test_bpf] [ 1133.716690] ? netlink_broadcast_filtered+0x153/0x3d0 [ 1133.716695] ? free_pcppages_bulk+0x324/0x600 [ 1133.716696] ? 0xffffffffa0279000 [ 1133.716699] ? do_one_initcall+0x46/0x1bd [ 1133.716704] ? kmem_cache_alloc_trace+0x144/0x1a0 [ 1133.716709] ? do_init_module+0x5b/0x209 [ 1133.716712] ? load_module+0x2136/0x25d0 [ 1133.716715] ? __do_sys_finit_module+0xba/0xe0 [ 1133.716717] ? __do_sys_finit_module+0xba/0xe0 [ 1133.716719] ? do_syscall_64+0x48/0x100 [ 1133.716724] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 This patch fixes tes_bpf by using init_net in the dummy dev. Fixes: `d58e468b11` ("flow_dissector: implements flow dissector BPF hook") Reported-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Petar Penkov <ppenkov@google.com> Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-27 21:09:45 +02:00
Andrey Ignatov	53d6eb08e9	bpftool: Fix bpftool net output Print `bpftool net` output to stdout instead of stderr. Only errors should be printed to stderr. Regular output should go to stdout and this is what all other subcommands of bpftool do, including --json and --pretty formats of `bpftool net` itself. Fixes: commit `f6f3bac08f` ("tools/bpf: bpftool: add net support") Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-27 21:07:44 +02:00
Maciej Żenczykowski	1042caa79e	net-ipv4: remove 2 always zero parameters from ipv4_redirect() (the parameters in question are mark and flow_flags) Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 20:30:55 -07:00
Maciej Żenczykowski	d888f39666	net-ipv4: remove 2 always zero parameters from ipv4_update_pmtu() (the parameters in question are mark and flow_flags) Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 20:30:55 -07:00
Maxime Chevallier	da58a931f2	net: mvneta: Add support for 2500Mbps SGMII The mvneta controller can handle speeds up to 2500Mbps on the SGMII interface. This relies on serdes configuration, the lane must be configured at 3.125Gbps and we can't use in-band autoneg at that speed. The main issue when supporting that speed on this particular controller is that the link partner can send ethernet frames with a shortened preamble, which if not explicitly enabled in the controller will cause unexpected behaviours. This was tested on Armada 385, with the comphy configuration done in bootloader. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 20:27:09 -07:00
David S. Miller	c09c1474d8	Merge branch 'net-vhost-improve-performance-when-enable-busyloop' Tonghao Zhang says: ==================== net: vhost: improve performance when enable busyloop This patches improve the guest receive performance. On the handle_tx side, we poll the sock receive queue at the same time. handle_rx do that in the same way. For more performance report, see patch 4 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 20:25:55 -07:00
Tonghao Zhang	441abde4cd	net: vhost: add rx busy polling in tx path This patch improves the guest receive performance. On the handle_tx side, we poll the sock receive queue at the same time. handle_rx do that in the same way. We set the poll-us=100us and use the netperf to test throughput and mean latency. When running the tests, the vhost-net kthread of that VM, is alway 100% CPU. The commands are shown as below. Rx performance is greatly improved by this patch. There is not notable performance change on tx with this series though. This patch is useful for bi-directional traffic. netperf -H IP -t TCP_STREAM -l 20 -- -O "THROUGHPUT, THROUGHPUT_UNITS, MEAN_LATENCY" Topology: [Host] ->linux bridge -> tap vhost-net ->[Guest] TCP_STREAM: * Without the patch: 19842.95 Mbps, 6.50 us mean latency * With the patch: 37598.20 Mbps, 3.43 us mean latency Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 20:25:55 -07:00
Tonghao Zhang	dc151282bb	net: vhost: factor out busy polling logic to vhost_net_busy_poll() Factor out generic busy polling logic and will be used for in tx path in the next patch. And with the patch, qemu can set differently the busyloop_timeout for rx queue. To avoid duplicate codes, introduce the helper functions: * sock_has_rx_data(changed from sk_has_rx_data) * vhost_net_busy_poll_try_queue Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 20:25:55 -07:00
Tonghao Zhang	a6a67a2f34	net: vhost: replace magic number of lock annotation Use the VHOST_NET_VQ_XXX as a subclass for mutex_lock_nested. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 20:25:55 -07:00
Tonghao Zhang	78139c94dc	net: vhost: lock the vqs one by one This patch changes the way that lock all vqs at the same, to lock them one by one. It will be used for next patch to avoid the deadlock. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 20:25:54 -07:00
Yafang Shao	af4325ecc2	tcp: expose sk_state in tcp_retransmit_skb tracepoint After sk_state exposed, we can get in which state this retransmission occurs. That could give us more detail for dignostic. For example, if this retransmission occurs in SYN_SENT state, it may also indicates that the syn packet may be dropped on the remote peer due to syn backlog queue full and then we could check the remote peer. BTW,SYNACK retransmission is traced in tcp_retransmit_synack tracepoint. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 20:07:19 -07:00
YueHaibing	0a71515665	net: faraday: fix return type of ndo_start_xmit function The method ndo_start_xmit() is defined as returning an 'netdev_tx_t', which is a typedef for an enum type, so make sure the implementation in this driver has returns 'netdev_tx_t' value, and change the function return type to netdev_tx_t. Found by coccinelle. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 10:18:08 -07:00
YueHaibing	6323d57f33	net: smsc: fix return type of ndo_start_xmit function The method ndo_start_xmit() is defined as returning an 'netdev_tx_t', which is a typedef for an enum type, so make sure the implementation in this driver has returns 'netdev_tx_t' value, and change the function return type to netdev_tx_t. Found by coccinelle. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 10:15:17 -07:00
zhong jiang	880e1b2111	net: liquidio: list usage cleanup Trival cleanup, list_move_tail will implement the same function that list_del() + list_add_tail() will do. hence just replace them. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 10:12:10 -07:00
zhong jiang	631e871edc	net: qed: list usage cleanup Trival cleanup, list_move_tail will implement the same function that list_del() + list_add_tail() will do. hence just replace them. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 10:11:36 -07:00
David S. Miller	30b0594a3e	Merge branch 'net-bridge-convert-bool-options-to-bits' Nikolay Aleksandrov says: ==================== net: bridge: convert bool options to bits A lot of boolean bridge options have been added around the net_bridge structure resulting in holes and more importantly different cache lines that need to be fetched in the fast path. This set moves all of those to bits in a bitfield which resides in a hot cache line thus reducing the size of net_bridge, the number of holes and the number of cache lines needed for the fast path. The set is also sent in preparation for new boolean options to avoid spreading them in the structure and making new holes. One nice side-effect is that we avoid potential race conditions by using the bitops since some of the options were bits being directly set in parallel risking hard to debug issues (has_ipv6_addr). Before: size: 1184, holes: 8, sum holes: 30 After: size: 1160, holes: 3, sum holes: 7 Patch 01 is a trivial style fix Patch 02 adds the new options bitfield and converts the vlan boolean options to bits Patches 03-08 convert the rest of the boolean options to bits Patch 09 re-arranges a few fields in net_bridge to further reduce size v2: patch 09: remove the comment about offload_fwd_mark in net_bridge and leave it where it is now, thanks to Ido for spotting it ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 10:04:23 -07:00
Nikolay Aleksandrov	35750b0bca	net: bridge: pack net_bridge better Further reduce the size of net_bridge with 8 bytes and reduce the number of holes in it: Before: holes: 5, sum holes: 15 After: holes: 3, sum holes: 7 Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 10:04:23 -07:00
Nikolay Aleksandrov	3341d91702	net: bridge: convert mtu_set_by_user to a bit Convert the last remaining bool option to a bit thus reducing the overall net_bridge size further by 8 bytes. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 10:04:23 -07:00
Nikolay Aleksandrov	c69c2cd444	net: bridge: convert neigh_suppress_enabled option to a bit Convert the neigh_suppress_enabled option to a bit. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 10:04:23 -07:00
Nikolay Aleksandrov	675779adbf	net: bridge: convert mcast options to bits This patch converts the rest of the mcast options to bits. It also packs the mcast options a little better by moving multicast_mld_version to an existing hole, reducing the net_bridge size by 8 bytes. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 10:04:23 -07:00
Nikolay Aleksandrov	13cefad2f2	net: bridge: convert and rename mcast disabled Convert mcast disabled to an option bit and while doing so convert the logic to check if multicast is enabled instead. That is make the logic follow the option value - if it's set then mcast is enabled and vice versa. This avoids a few confusing places where we inverted the value that's being set to follow the mcast_disabled logic. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 10:04:23 -07:00
Nikolay Aleksandrov	be3664a038	net: bridge: convert group_addr_set option to a bit Convert group_addr_set internal bridge opt to a bit. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 10:04:23 -07:00
Nikolay Aleksandrov	8df3510f28	net: bridge: convert nf call options to bits No functional change, convert of nf_call_[ip\|ip6\|arp]tables to bits. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 10:04:23 -07:00
Nikolay Aleksandrov	ae75767ec2	net: bridge: add bitfield for options and convert vlan opts Bridge options have usually been added as separate fields all over the net_bridge struct taking up space and ending up in different cache lines. Let's move them to a single bitfield to save up space and speedup lookups. This patch adds a simple API for option modifying and retrieving using bitops and converts the first user of the API - the bridge vlan options (vlan_enabled and vlan_stats_enabled). Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 10:04:22 -07:00
Nikolay Aleksandrov	1c1cb6d032	net: bridge: make struct opening bracket consistent Currently we have a mix of opening brackets on new lines and on the same line, let's move them all on the same line. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 10:04:22 -07:00
David S. Miller	37ac5db6e6	Merge branch 's390-net-next' Julian Wiedmann says: ==================== s390/net: updates 2018-09-26 please apply one more series of cleanups and small improvements for qeth to net-next. Note that one patch needs to touch both af_iucv and qeth, in order to untangle their receive paths. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 09:56:08 -07:00
Julian Wiedmann	91cc98f51e	s390/qeth: remove duplicated carrier state tracking The netdevice is always available, apply any carrier state changes to it without caching them. On a STARTLAN event (ie. carrier-up), defer updating the state to qeth_core_hardsetup_card() in the subsequent recovery action. Also remove the carrier-state checks from the xmit routines. Stopping transmission on carrier-down is the responsibility of upper-level code (eg see dev_direct_xmit()). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 09:56:08 -07:00
Julian Wiedmann	d782d80f36	s390/qeth: clean up drop conditions for received cmds If qeth_check_ipa_data() consumed an event, there's no point in processing it further. So drop it early, and make the surrounding code a tiny bit more readable. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 09:56:07 -07:00
Julian Wiedmann	d19b93f40e	s390/qeth: re-indent qeth_check_ipa_data() Pull one level of checking up into qeth_send_control_data_cb(), and clean up an else-after-return. No functional change. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 09:56:07 -07:00
Julian Wiedmann	68bba11643	s390/qeth: consume local address events We have no code that is waiting for these events, so just drop them when they arrive. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 09:56:07 -07:00
Julian Wiedmann	6585ac4e5d	s390/qeth: remove various redundant code 1. tracing iob->rc makes no sense when it hasn't been modified by the callback, 2. the qeth_dbf_list is declared with LIST_HEAD, which also initializes the list, 3. the ccwgroup core only calls the thaw/restore callbacks if the gdev is online, so we don't have to check for it again. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 09:56:07 -07:00
Julian Wiedmann	8d908eb045	s390/qeth: remove CARD_FROM_CDEV helper The cdev-to-card translation walks through two layers of drvdata, with no locking or refcounting (where eg. the ccwgroup core only accesses a cdev's drvdata while holding the ccwlock). This might be safe for now, but any careless usage of the helper has the potential for subtle races and use-after-free's. Luckily there's only one occurrence where we _really_ need it (in qeth_irq()), for any other user we can just pass through an appropriate card pointer. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 09:56:07 -07:00
Julian Wiedmann	8f6637b878	s390/qeth: pass card pointer in iob callback This allows us to remove the CARD_FROM_CDEV calls in the iob callbacks. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 09:56:07 -07:00
Julian Wiedmann	6a3123d076	s390/qeth: re-use qeth_notify_skbs() When not using the CQ, this allows us avoid the second skb queue walk in qeth_release_skbs(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 09:56:07 -07:00
Julian Wiedmann	5a5312bdba	s390/qeth: remove additional skb refcount This was presumably left over from back when qeth recursed into dev_queue_xmit(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 09:56:07 -07:00
Julian Wiedmann	dc149e3764	s390/qeth: replace open-coded skb_queue_walk() To match the use of __skb_queue_purge(), also make the skb's enqueue in qeth_fill_buffer() lockless. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 09:56:07 -07:00
Julian Wiedmann	cd11d11286	net/af_iucv: locate IUCV header via skb_network_header() This patch attempts to untangle the TX and RX code in qeth from af_iucv's respective HiperTransport path: On the TX side, pointing skb_network_header() at the IUCV header means that qeth_l3_fill_af_iucv_hdr() no longer needs a magical offset to access the header. On the RX side, qeth pulls the (fake) L2 header off the skb like any normal ethernet driver would. This makes working with the IUCV header in af_iucv easier, since we no longer have to assume a fixed skb layout. While at it, replace the open-coded length checks in af_iucv's RX path with pskb_may_pull(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 09:56:07 -07:00
Julian Wiedmann	a2eb0ad50c	s390/qeth: on gdev release, reset drvdata qeth_core_probe_device() sets the gdev's drvdata, but doesn't reset it on a subsequent error. Move the (re-)setting around a bit, so that it happens symmetrically on allocating/freeing the qeth_card struct. This is no actual problem, as the ccwgroup core will discard the gdev on a probe error. But from qeth's perspective the gdev is an external resource, so it's best to manage it cleanly. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-26 09:56:07 -07:00

1 2 3 4 5 ...

783637 Commits All Branches Search

783637 Commits

All Branches