linux

Commit Graph

Author	SHA1	Message	Date
David S. Miller	e9e90a70cc	Merge branch 'resil-nhgroups-netdevsim-selftests' Petr Machata says: ==================== net: Resilient NH groups: netdevsim, selftests Support for resilient next-hop groups was added in a previous patch set. Resilient next hop groups add a layer of indirection between the SKB hash and the next hop. Thus the hash is used to reference a hash table bucket, which is then used to reference a particular next hop. This allows the system more flexibility when assigning SKB hash space to next hops. Previously, each next hop had to be assigned a continuous range of SKB hash space. With a hash table as an intermediate layer, it is possible to reassign next hops with a hash table bucket granularity. In turn, this mends issues with traffic flow redirection resulting from next hop removal or adjustments in next-hop weights. This patch set introduces mock offloading of resilient next hop groups by the netdevsim driver, and a suite of selftests. - Patch #1 adds a netdevsim-specific lock to protect next-hop hashtable. Previously, netdevsim relied on RTNL to maintain mutual exclusion. Patch #2 extracts a helper to make the following patches clearer. - Patch #3 implements the support for offloading of resilient next-hop groups. - Patch #4 introduces a new debugfs interface to set activity on a selected next-hop bucket. This simulates how HW can periodically report bucket activity, and buckets thus marked are expected to be exempt from migration to new next hops when the group changes. - Patches #5 and #6 clean up the fib_nexthop selftests. - Patches #7, #8 and #9 add tests for resilient next hop groups. Patch #7 adds resilient-hashing counterparts to fib_nexthops.sh. Patch #8 adds a new traffic test for resilient next-hop groups. Patch #9 adds a new traffic test for tunneling. - Patch #10 actually leverages the netdevsim offload to implement a suite of algorithmic tests that verify how and when buckets are migrated under various simulated workload scenarios. The overall plan is to contribute approximately the following patchsets: 1) Nexthop policy refactoring (already pushed) 2) Preparations for resilient next hop groups (already pushed) 3) Implementation of resilient next hop group (already pushed) 4) Netdevsim offload plus a suite of selftests (this patchset) 5) Preparations for mlxsw offload of resilient next-hop groups 6) mlxsw offload including selftests Interested parties can look at the complete code at [2]. [1] https://tools.ietf.org/html/rfc2992 [2] https://github.com/idosch/linux/commits/submit/res_integ_v1 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
Ido Schimmel	b8a07c4cea	selftests: netdevsim: Add test for resilient nexthop groups offload API Test various aspects of the resilient nexthop group offload API on top of the netdevsim implementation. Both good and bad flows are tested. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Co-developed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
Ido Schimmel	902280cacc	selftests: forwarding: Add resilient multipath tunneling nexthop test Add a resilient nexthop objects version of gre_multipath_nh.sh. Test that both IPv4 and IPv6 overlays work with resilient nexthop groups where the nexthops are two GRE tunnels. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
Ido Schimmel	386e3792b5	selftests: forwarding: Add resilient hashing test Verify that IPv4 and IPv6 multipath forwarding works correctly with resilient nexthop groups and with different weights. Test that when the idle timer is not zero, the resilient groups are not rebalanced - because the nexthop buckets are considered active - and the initial weights (1:1) are used. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
Ido Schimmel	557205f47d	selftests: fib_nexthops: Test resilient nexthop groups Add test cases for resilient nexthop groups. Exhaustive forwarding tests are added separately under net/forwarding/. Examples: # ./fib_nexthops.sh -t basic_res Basic resilient nexthop group functional tests ---------------------------------------------- TEST: Add a nexthop group with default parameters [ OK ] TEST: Get a nexthop group with default parameters [ OK ] TEST: Get a nexthop group with non-default parameters [ OK ] TEST: Add a nexthop group with 0 buckets [ OK ] TEST: Replace nexthop group parameters [ OK ] TEST: Get a nexthop group after replacing parameters [ OK ] TEST: Replace idle timer [ OK ] TEST: Get a nexthop group after replacing idle timer [ OK ] TEST: Replace unbalanced timer [ OK ] TEST: Get a nexthop group after replacing unbalanced timer [ OK ] TEST: Replace with no parameters [ OK ] TEST: Get a nexthop group after replacing no parameters [ OK ] TEST: Replace nexthop group type - implicit [ OK ] TEST: Replace nexthop group type - explicit [ OK ] TEST: Replace number of nexthop buckets [ OK ] TEST: Get a nexthop group after replacing with invalid parameters [ OK ] TEST: Dump all nexthop buckets [ OK ] TEST: Dump all nexthop buckets in a group [ OK ] TEST: Dump all nexthop buckets with a specific nexthop device [ OK ] TEST: Dump all nexthop buckets with a specific nexthop identifier [ OK ] TEST: Dump all nexthop buckets in a non-existent group [ OK ] TEST: Dump all nexthop buckets in a non-resilient group [ OK ] TEST: Dump all nexthop buckets using a non-existent device [ OK ] TEST: Dump all nexthop buckets with invalid 'groups' keyword [ OK ] TEST: Dump all nexthop buckets with invalid 'fdb' keyword [ OK ] TEST: Get a valid nexthop bucket [ OK ] TEST: Get a nexthop bucket with valid group, but invalid index [ OK ] TEST: Get a nexthop bucket from a non-resilient group [ OK ] TEST: Get a nexthop bucket from a non-existent group [ OK ] Tests passed: 29 Tests failed: 0 # ./fib_nexthops.sh -t ipv4_large_res_grp IPv4 large resilient group (128k buckets) ----------------------------------------- TEST: Dump large (x131072) nexthop buckets [ OK ] Tests passed: 1 Tests failed: 0 # ./fib_nexthops.sh -t ipv6_large_res_grp IPv6 large resilient group (128k buckets) ----------------------------------------- TEST: Dump large (x131072) nexthop buckets [ OK ] Tests passed: 1 Tests failed: 0 # ./fib_nexthops.sh -t ipv4_res_torture IPv4 runtime resilient nexthop group torture -------------------------------------------- TEST: IPv4 resilient nexthop group torture test [ OK ] Tests passed: 1 Tests failed: 0 # ./fib_nexthops.sh -t ipv6_res_torture IPv6 runtime resilient nexthop group torture -------------------------------------------- TEST: IPv6 resilient nexthop group torture test [ OK ] Tests passed: 1 Tests failed: 0 # ./fib_nexthops.sh -t ipv4_res_grp_fcnal IPv4 resilient groups functional -------------------------------- TEST: Nexthop group updated when entry is deleted [ OK ] TEST: Nexthop buckets updated when entry is deleted [ OK ] TEST: Nexthop group updated after replace [ OK ] TEST: Nexthop buckets updated after replace [ OK ] TEST: Nexthop group updated when entry is deleted - nECMP [ OK ] TEST: Nexthop buckets updated when entry is deleted - nECMP [ OK ] TEST: Nexthop group updated after replace - nECMP [ OK ] TEST: Nexthop buckets updated after replace - nECMP [ OK ] Tests passed: 8 Tests failed: 0 # ./fib_nexthops.sh -t ipv6_res_grp_fcnal IPv6 resilient groups functional -------------------------------- TEST: Nexthop group updated when entry is deleted [ OK ] TEST: Nexthop buckets updated when entry is deleted [ OK ] TEST: Nexthop group updated after replace [ OK ] TEST: Nexthop buckets updated after replace [ OK ] TEST: Nexthop group updated when entry is deleted - nECMP [ OK ] TEST: Nexthop buckets updated when entry is deleted - nECMP [ OK ] TEST: Nexthop group updated after replace - nECMP [ OK ] TEST: Nexthop buckets updated after replace - nECMP [ OK ] Tests passed: 8 Tests failed: 0 Signed-off-by: Ido Schimmel <idosch@nvidia.com> Co-developed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
Ido Schimmel	a8f9952d21	selftests: fib_nexthops: List each test case in a different line The lines with the IPv4 and IPv6 test cases are already very long and more test cases will be added in subsequent patches. List each test case in a different line to make it easier to extend the test with more test cases. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
Ido Schimmel	8e815284a5	selftests: fib_nexthops: Declutter test output Before: # ./fib_nexthops.sh -t ipv4_torture IPv4 runtime torture -------------------- TEST: IPv4 torture test [ OK ] ./fib_nexthops.sh: line 213: 19376 Killed ipv4_del_add_loop1 ./fib_nexthops.sh: line 213: 19377 Killed ipv4_grp_replace_loop ./fib_nexthops.sh: line 213: 19378 Killed ip netns exec me ping -f 172.16.101.1 > /dev/null 2>&1 ./fib_nexthops.sh: line 213: 19380 Killed ip netns exec me ping -f 172.16.101.2 > /dev/null 2>&1 ./fib_nexthops.sh: line 213: 19381 Killed ip netns exec me mausezahn veth1 -B 172.16.101.2 -A 172.16.1.1 -c 0 -t tcp "dp=1-1023, flags=syn" > /dev/null 2>&1 Tests passed: 1 Tests failed: 0 # ./fib_nexthops.sh -t ipv6_torture IPv6 runtime torture -------------------- TEST: IPv6 torture test [ OK ] ./fib_nexthops.sh: line 213: 24453 Killed ipv6_del_add_loop1 ./fib_nexthops.sh: line 213: 24454 Killed ipv6_grp_replace_loop ./fib_nexthops.sh: line 213: 24456 Killed ip netns exec me ping -f 2001:db8:101::1 > /dev/null 2>&1 ./fib_nexthops.sh: line 213: 24457 Killed ip netns exec me ping -f 2001:db8:101::2 > /dev/null 2>&1 ./fib_nexthops.sh: line 213: 24458 Killed ip netns exec me mausezahn -6 veth1 -B 2001:db8:101::2 -A 2001:db8:91::1 -c 0 -t tcp "dp=1-1023, flags=syn" > /dev/null 2>&1 Tests passed: 1 Tests failed: 0 After: # ./fib_nexthops.sh -t ipv4_torture IPv4 runtime torture -------------------- TEST: IPv4 torture test [ OK ] Tests passed: 1 Tests failed: 0 # ./fib_nexthops.sh -t ipv6_torture IPv6 runtime torture -------------------- TEST: IPv6 torture test [ OK ] Tests passed: 1 Tests failed: 0 Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
Ido Schimmel	c6385c0b67	netdevsim: Allow reporting activity on nexthop buckets A key component of the resilient hashing algorithm is the hash buckets' activity. If a bucket is active, it will not be populated with a new nexthop in order not to break existing flows. Therefore, in order to easily and thoroughly test the algorithm, we need to be in full control over the reported activity. Add a debugfs interface that allows user space to have netdevsim report a nexthop bucket within a resilient nexthop group as active. For example: # echo 10 23 > /sys/kernel/debug/netdevsim/netdevsim10/fib/nexthop_bucket_activity Will mark bucket 23 in nexthop group 10 as active. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
Ido Schimmel	d8eaa4faca	netdevsim: Add support for resilient nexthop groups Allow resilient nexthop groups to be programmed and account their occupancy according to their number of buckets. The nexthop group itself as well as its buckets are marked with hardware flags (i.e., 'RTNH_F_TRAP'). Replacement of a single nexthop bucket can fail using the following debugfs knob: # cat /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_nexthop_bucket_replace N # echo 1 > /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_nexthop_bucket_replace # cat /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_nexthop_bucket_replace Y Replacement of a resilient nexthop group can fail using the following debugfs knob: # cat /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_res_nexthop_group_replace N # echo 1 > /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_res_nexthop_group_replace # cat /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_res_nexthop_group_replace Y This enables testing of various error paths. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
Ido Schimmel	40ff83711f	netdevsim: Create a helper for setting nexthop hardware flags Instead of calling nexthop_set_hw_flags(), call a helper. It will be used to also set nexthop bucket flags in a subsequent patch. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
Petr Machata	86927c9c4d	netdevsim: fib: Introduce a lock to guard nexthop hashtable Currently netdevsim relies on RTNL to maintain exclusivity in accessing the nexthop hash table. However, bucket notification may be called without RTNL having been held. Instead, introduce a custom lock to guard the table. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
David S. Miller	b202923d3a	Merge branch 'ptp-warnings' Lee Jones says: ==================== Rid W=1 warnings from PTP This set is part of a larger effort attempting to clean-up W=1 kernel builds, which are currently overwhelmingly riddled with niggly little warnings. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:09:34 -08:00
Lee Jones	287f93ded6	ptp: ptp_p: Demote non-conformant kernel-doc headers and supply a param description Fixes the following W=1 kernel build warning(s): drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'control' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'event' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'addend' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'accum' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'test' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'ts_compare' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'rsystime_lo' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'rsystime_hi' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'systime_lo' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'systime_hi' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'trgt_lo' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'trgt_hi' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'asms_lo' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'asms_hi' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'amms_lo' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'amms_hi' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'ch_control' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'ch_event' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'tx_snap_lo' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'tx_snap_hi' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'rx_snap_lo' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'rx_snap_hi' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'src_uuid_lo' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'src_uuid_hi' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'can_status' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'can_snap_lo' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'can_snap_hi' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'ts_sel' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'ts_st' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'reserve1' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'stl_max_set_en' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'stl_max_set' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'reserve2' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'srst' not described in 'pch_ts_regs' drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'regs' not described in 'pch_dev' drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'ptp_clock' not described in 'pch_dev' drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'caps' not described in 'pch_dev' drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'exts0_enabled' not described in 'pch_dev' drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'exts1_enabled' not described in 'pch_dev' drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'mem_base' not described in 'pch_dev' drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'mem_size' not described in 'pch_dev' drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'irq' not described in 'pch_dev' drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'pdev' not described in 'pch_dev' drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'register_lock' not described in 'pch_dev' drivers/ptp/ptp_pch.c:128: warning: Function parameter or member 'station' not described in 'pch_params' drivers/ptp/ptp_pch.c:291: warning: Function parameter or member 'pdev' not described in 'pch_set_station_address' Cc: Richard Cochran <richardcochran@gmail.com> Cc: LAPIS SEMICONDUCTOR <tshimizu818@gmail.com> Cc: netdev@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:09:34 -08:00
Lee Jones	9ec04c71ab	ptp: ptp_clockmatrix: Demote non-kernel-doc header to standard comment Fixes the following W=1 kernel build warning(s): drivers/ptp/ptp_clockmatrix.c:1408: warning: Cannot understand * @brief Maximum absolute value for write phase offset in picoseconds drivers/ptp/ptp_clockmatrix.c:1408: warning: Cannot understand * @brief Maximum absolute value for write phase offset in picoseconds drivers/ptp/ptp_clockmatrix.c:1408: warning: Cannot understand * @brief Maximum absolute value for write phase offset in picoseconds drivers/ptp/ptp_clockmatrix.c:1408: warning: Cannot understand * @brief Maximum absolute value for write phase offset in picoseconds drivers/ptp/ptp_clockmatrix.c:1408: warning: Cannot understand * @brief Maximum absolute value for write phase offset in picoseconds Cc: Richard Cochran <richardcochran@gmail.com> Cc: IDT-support-1588@lm.renesas.com Cc: netdev@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:09:34 -08:00
Lee Jones	f90fc37f28	ptp_pch: Move 'pch_*()' prototypes to shared header Fixes the following W=1 kernel build warning(s): drivers/ptp/ptp_pch.c:193:6: warning: no previous prototype for ‘pch_ch_control_write’ [-Wmissing-prototypes] drivers/ptp/ptp_pch.c:201:5: warning: no previous prototype for ‘pch_ch_event_read’ [-Wmissing-prototypes] drivers/ptp/ptp_pch.c:212:6: warning: no previous prototype for ‘pch_ch_event_write’ [-Wmissing-prototypes] drivers/ptp/ptp_pch.c:220:5: warning: no previous prototype for ‘pch_src_uuid_lo_read’ [-Wmissing-prototypes] drivers/ptp/ptp_pch.c:231:5: warning: no previous prototype for ‘pch_src_uuid_hi_read’ [-Wmissing-prototypes] drivers/ptp/ptp_pch.c:242:5: warning: no previous prototype for ‘pch_rx_snap_read’ [-Wmissing-prototypes] drivers/ptp/ptp_pch.c:259:5: warning: no previous prototype for ‘pch_tx_snap_read’ [-Wmissing-prototypes] drivers/ptp/ptp_pch.c:300:5: warning: no previous prototype for ‘pch_set_station_address’ [-Wmissing-prototypes] Cc: Richard Cochran <richardcochran@gmail.com> (maintainer:PTP HARDWARE CLOCK SUPPORT) Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Flavio Suligoi <f.suligoi@asem.it> Cc: netdev@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:09:34 -08:00
Lee Jones	257382c54e	ptp_pch: Remove unused function 'pch_ch_control_read()' Fixes the following W=1 kernel build warning(s): drivers/ptp/ptp_pch.c:182:5: warning: no previous prototype for ‘pch_ch_control_read’ [-Wmissing-prototypes] Cc: Richard Cochran <richardcochran@gmail.com> (maintainer:PTP HARDWARE CLOCK SUPPORT) Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Flavio Suligoi <f.suligoi@asem.it> Cc: netdev@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:09:34 -08:00
Rafał Miłecki	a9349f08ec	net: dsa: bcm_sf2: setup BCM4908 internal crossbar On some SoCs (e.g. BCM4908, BCM631[345]8) SF2 has an integrated crossbar. It allows connecting its selected external ports to internal ports. It's used by vendors to handle custom Ethernet setups. BCM4908 has following 3x2 crossbar. On Asus GT-AC5300 rgmii is used for connecting external BCM53134S switch. GPHY4 is usually used for WAN port. More fancy devices use SerDes for 2.5 Gbps Ethernet. ┌──────────┐ SerDes ─── 0 ─┤ │ │ 3x2 ├─ 0 ─── switch port 7 GPHY4 ─── 1 ─┤ │ │ crossbar ├─ 1 ─── runner (accelerator) rgmii ─── 2 ─┤ │ └──────────┘ Use setup data based on DT info to configure BCM4908's switch port 7. Right now only GPHY and rgmii variants are supported. Handling SerDes can be implemented later. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:06:37 -08:00
Rafał Miłecki	01488a0ccd	net: dsa: bcm_sf2: store PHY interface/mode in port structure It's needed later for proper switch / crossbar setup. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:06:37 -08:00
Shubhankar Kuranagatti	6ad086009f	net: ipv4: route.c: Fix indentation of multi line comment. All comment lines inside the comment block have been aligned. Every line of comment starts with a * (uniformity in code). Signed-off-by: Shubhankar Kuranagatti <shubhankarvk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:02:30 -08:00
Rafał Miłecki	12bb508bfe	net: broadcom: bcm4908_enet: support TX interrupt It appears that each DMA channel has its own interrupt and both rings can be configured (the same way) to handle interrupts. 1. Make ring interrupts code generic (make it operate on given ring) 2. Move napi to ring (so each has its own) 3. Make IRQ handler generic (match ring against received IRQ number) 4. Add (optional) support for TX interrupt Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 16:48:38 -08:00
Rafał Miłecki	ab4dda7a8c	dt-bindings: net: bcm4908-enet: add optional TX interrupt I discovered that hardware actually supports two interrupts, one per DMA channel (RX and TX). Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 16:48:38 -08:00
David S. Miller	26d2e0426a	Merge branch 'macb-fixed-link-fixes' Robert Hancock says: ==================== macb SGMII fixed-link fixes Some fixes to the macb driver for use in SGMII mode with a fixed-link (such as for chip-to-chip connectivity). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 16:44:46 -08:00
Robert Hancock	e276e5e40e	net: macb: Disable PCS auto-negotiation for SGMII fixed-link mode When using a fixed-link configuration in SGMII mode, it's not really sensible to have auto-negotiation enabled since the link settings are fixed by definition. In other configurations, such as an SGMII connection to a PHY, it should generally be enabled. Signed-off-by: Robert Hancock <robert.hancock@calian.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 16:44:45 -08:00
Robert Hancock	8fab174b78	net: macb: poll for fixed link state in SGMII mode When using a fixed-link configuration with GEM in SGMII mode, such as for a chip-to-chip interconnect, the link state was always showing as established regardless of the actual connectivity state. We can monitor the pcs_link_state bit in the Network Status register to determine whether the PCS link state is actually up. Signed-off-by: Robert Hancock <robert.hancock@calian.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 16:44:45 -08:00
David S. Miller	c232f81b0a	mlx5-updates-2021-03-12 1) TC support for ICMP parameters 2) TC connection tracking with mirroring 3) A round of trivial fixups and cleanups -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmBL+V4ACgkQSD+KveBX +j68tgf+Lzq6QkpGl4dpg87pjuKIwyykGVJ0NW/Yig+bNe9V3pbFCv785xymD16N ALcLKNtxUovE6p+Gy6JgzgBN7V9ZsM6bn564487ycYIqewz9c2KIQneJdGEUUDgi S8NkuCaLQst36Wl8gdiygApZmw8TAafOQYYzKOZ7HdZ+aRH+fIwbVABYreAYwjGa 2iz5yu1AEzG5/10bTEx69rQHY+309EPy4N4na/jo/tXRL3fWQSQz6hzEnoi3EjAU YsktBB72eBX/NogqLIHpSzzjPgm5IFRQbNSHyL71TH9KEr0KVpzJd9VCbaic6goL gQn0TqoE74X7r4ZPWP3AHcGOL9L5Rg== =L2PO -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2021-03-12' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2021-03-12 1) TC support for ICMP parameters 2) TC connection tracking with mirroring 3) A round of trivial fixups and cleanups ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 16:36:07 -08:00
Maor Dickman	a3222a2da0	net/mlx5e: Allow to match on ICMP parameters Support matching on ICMPv4/6 type and code parameters using misc3 section of match parameters. Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 15:29:34 -08:00
Paul Blakey	69e2916ebc	net/mlx5: CT: Add support for mirroring Add support for mirroring before the CT action by spliting the pre ct rule. Mirror outputs are done first on the tc chain,prio table rule (the fwd rule), which will then forward to a per port fwd table. On this fwd table, we insert the original pre ct rule that forwards to ct/ct nat table. Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 15:29:33 -08:00
Alaa Hleihel	287e0df021	net/mlx5: Display the command index in command mailbox dump Multiple commands can be printed at the same time which can lead to wrong order of their lines in dmesg output. As a result, it's hard to match data dumps to the correct command or which command was fully dumped at some point. Fix this by displaying the corresponding command index, and also indicate when a command was fully dumped. Signed-off-by: Alaa Hleihel <alaa@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 15:29:33 -08:00
Arnd Bergmann	2119bda642	net/mlx5e: allocate 'indirection_rqt' buffer dynamically Increasing the size of the indirection_rqt array from 128 to 256 bytes pushed the stack usage of the mlx5e_hairpin_fill_rqt_rqns() function over the warning limit when building with clang and CONFIG_KASAN: drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:970:1: error: stack frame size of 1180 bytes in function 'mlx5e_tc_add_nic_flow' [-Werror,-Wframe-larger-than=] Using dynamic allocation here is safe because the caller does the same, and it reduces the stack usage of the function to just a few bytes. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 15:29:32 -08:00
Tariq Toukan	e16cf9d754	net/mlx5e: Dump ICOSQ WQE descriptor on CQE with error events Dump the ICOSQ's WQE descriptor when a completion with error is received. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 15:29:32 -08:00
Maxim Mikityanskiy	991b265460	net/mlx5e: Use net_prefetchw instead of prefetchw in MPWQE TX datapath Commit `e20f0dbf20` ("net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES") switched to using net_prefetchw at all places in mlx5e. In the same time frame, commit `5af75c747e` ("net/mlx5e: Enhanced TX MPWQE for SKBs") added one more usage of prefetchw. When these two changes were merged, this new occurrence of prefetchw wasn't replaced with net_prefetchw. This commit fixes this last occurrence of prefetchw in mlx5e_tx_mpwqe_session_start, making the same change that was done in mlx5e_xdp_mpwqe_session_start. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 15:29:31 -08:00
Roi Dayan	bca08a9145	net/mlx5e: Remove redundant newline in NL_SET_ERR_MSG_MOD Fix the following coccicheck warnings: drivers/net/ethernet/mellanox/mlx5/core/devlink.c:145:29-66: WARNING avoid newline at end of message in NL_SET_ERR_MSG_MOD drivers/net/ethernet/mellanox/mlx5/core/devlink.c:140:29-77: WARNING avoid newline at end of message in NL_SET_ERR_MSG_MOD Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 15:29:31 -08:00
Mark Zhang	093bd76469	net/mlx5: Read congestion counters from all ports when lag is active Read congestion counters from all ports in any lag mode rather than only in RoCE lag mode (e.g., VF lag). Signed-off-by: Mark Zhang <markzhang@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 15:29:31 -08:00
Jiapeng Chong	7976092241	net/mlx5: remove unneeded semicolon Fix the following coccicheck warnings: ./drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c:495:2-3: Unneeded semicolon. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 15:29:30 -08:00
Junlin Yang	ad2c99ca75	net/mlx5: use kvfree() for memory allocated with kvzalloc() It is allocated with kvzalloc(), the corresponding release function should not be kfree(), use kvfree() instead. Generated by: scripts/coccinelle/api/kfree_mismatch.cocci Signed-off-by: Junlin Yang <yangjunlin@yulong.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 15:29:30 -08:00
Yevgeny Kliteynik	cc82a2e6c8	net/mlx5: DR, Add missing vhca_id consume from STEv1 The field source_eswitch_owner_vhca_id was not consumed in the same way as in STEv0. Added the missing set. Fixes: `10b6941864` ("net/mlx5: DR, Add HW STEv1 match logic") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 15:29:30 -08:00
Yevgeny Kliteynik	1412477882	net/mlx5: DR, Remove unneeded rx_decap_l3 function for STEv1 Remove the dr_ste_v1_set_rx_decap_l3 function that was replaced by another function - fixing a rebase error. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 15:29:29 -08:00
Yevgeny Kliteynik	0142f09764	net/mlx5: DR, Fixed typo in STE v0 "reforamt" -> "reformat" Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 15:29:29 -08:00
Jonathan Neuschäfer	bfdfe7fc1b	docs: networking: phy: Improve placement of parenthesis "either" is outside the parentheses, so the matching "or" should be too. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 12:29:11 -08:00
David S. Miller	5215206d8b	Merge branch 'tcp-delayed-completions' Eric Dumazet says: ==================== tcp: better deal with delayed TX completions Jakub and Neil reported an increase of RTO timers whenever TX completions are delayed a bit more (by increasing NIC TX coalescing parameters) While problems have been there forever, second patch might introduce some regressions so I prefer not backport them to stable releases before things settle. Many thanks to FB team for their help and tests. Few packetdrill tests need to be changed to reflect the improvements brought by this series. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 18:35:31 -08:00
Eric Dumazet	ac3959fd0d	tcp: remove obsolete check in __tcp_retransmit_skb() TSQ provides a nice way to avoid bufferbloat on individual socket, including retransmit packets. We can get rid of the old heuristic: /* Do not sent more than we queued. 1/4 is reserved for possible * copying overhead: fragmentation, tunneling, mangling etc. */ if (refcount_read(&sk->sk_wmem_alloc) > min_t(u32, sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2), sk->sk_sndbuf)) return -EAGAIN; This heuristic was giving false positives according to Jakub, whenever TX completions are delayed above RTT. (Ack packets are processed by TCP stack before clones are orphaned/freed) Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Jakub Kicinski <kuba@kernel.org> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 18:35:31 -08:00
Eric Dumazet	a7abf3cd76	tcp: consider using standard rtx logic in tcp_rcv_fastopen_synack() Jakub reported Data included in a Fastopen SYN that had to be retransmit would have to wait for an RTO if TX completions are slow, even with prior fix. This is because tcp_rcv_fastopen_synack() does not use standard rtx logic, meaning TSQ handler exits early in tcp_tsq_write() because tp->lost_out == tp->retrans_out Lets make tcp_rcv_fastopen_synack() use standard rtx logic, by using tcp_mark_skb_lost() on the skb thats needs to be sent again. Not this raised a warning in tcp_fastretrans_alert() during my tests since we consider the data not being aknowledged by the receiver does not mean packet was lost on the network. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Jakub Kicinski <kuba@kernel.org> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 18:35:31 -08:00
Eric Dumazet	f4dae54e48	tcp: plug skb_still_in_host_queue() to TSQ Jakub and Neil reported an increase of RTO timers whenever TX completions are delayed a bit more (by increasing NIC TX coalescing parameters) Main issue is that TCP stack has a logic preventing a packet being retransmit if the prior clone has not yet been orphaned or freed. This logic came with commit `1f3279ae0c` ("tcp: avoid retransmits of TCP packets hanging in host queues") Thankfully, in the case skb_still_in_host_queue() detects the initial clone is still in flight, it can use TSQ logic that will eventually retry later, at the moment the clone is freed or orphaned. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Neil Spring <ntspring@fb.com> Reported-by: Jakub Kicinski <kuba@kernel.org> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 18:35:31 -08:00
Tong Zhang	8176f8c0f0	isdn: remove extra spaces in the header file fix some coding style issues in the isdn header Signed-off-by: Tong Zhang <ztong0001@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 18:23:55 -08:00
Hoang Huu Le	97bc84bbd4	tipc: clean up warnings detected by sparse This patch fixes the following warning from sparse: net/tipc/monitor.c:263:35: warning: incorrect type in assignment (different base types) net/tipc/monitor.c:263:35: expected unsigned int net/tipc/monitor.c:263:35: got restricted __be32 [usertype] [...] net/tipc/node.c:374:13: warning: context imbalance in 'tipc_node_read_lock' - wrong count at exit net/tipc/node.c:379:13: warning: context imbalance in 'tipc_node_read_unlock' - unexpected unlock net/tipc/node.c:384:13: warning: context imbalance in 'tipc_node_write_lock' - wrong count at exit net/tipc/node.c:389:13: warning: context imbalance in 'tipc_node_write_unlock_fast' - unexpected unlock net/tipc/node.c:404:17: warning: context imbalance in 'tipc_node_write_unlock' - unexpected unlock [...] net/tipc/crypto.c:1201:9: warning: incorrect type in initializer (different address spaces) net/tipc/crypto.c:1201:9: expected struct tipc_aead [noderef] __rcu __tmp net/tipc/crypto.c:1201:9: got struct tipc_aead [...] Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 18:06:54 -08:00
Hoang Le	1980d37565	tipc: convert dest node's address to network order (struct tipc_link_info)->dest is in network order (__be32), so we must convert the value to network order before assigning. The problem detected by sparse: net/tipc/netlink_compat.c:699:24: warning: incorrect type in assignment (different base types) net/tipc/netlink_compat.c:699:24: expected restricted __be32 [usertype] dest net/tipc/netlink_compat.c:699:24: got int Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 18:06:54 -08:00
David S. Miller	1520929e26	Merge branch 'mlxsw-Implement-sampling-using-mirroring' Ido Schimmel says: ==================== mlxsw: Implement sampling using mirroring So far, sampling was implemented using a dedicated sampling mechanism that is available on all Spectrum ASICs. Spectrum-2 and later ASICs support sampling by mirroring packets to the CPU port with probability. This method has a couple of advantages compared to the legacy method: * Extra metadata per-packet: Egress port, egress traffic class, traffic class occupancy and end-to-end latency * Ability to sample packets on egress / per-flow as opposed to only ingress This series should not result in any user-visible changes and its aim is to convert Spectrum-2 and later ASICs to perform sampling by mirroring to the CPU port with probability. Future submissions will expose the additional metadata and enable sampling using more triggers (e.g., egress). Series overview: Patches #1-#3 extend the SPAN (mirroring) module to accept new parameters required for sampling. See individual commit messages for detailed explanation. Patch #4-#5 split sampling support between Spectrum-1 and later ASIC while still using the legacy method for all ASIC generations. Patch #6 converts Spectrum-2 and later ASICs to perform sampling by mirroring to the CPU port with probability. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:22:39 -08:00
Ido Schimmel	cf31190ae0	mlxsw: spectrum_matchall: Implement sampling using mirroring Spectrum-2 and later ASICs support sampling of packets by mirroring to the CPU with probability. There are several advantages compared to the legacy dedicated sampling mechanism: * Extra metadata per-packet: Egress port, egress traffic class, traffic class occupancy and end-to-end latency * Ability to sample packets on egress / per-flow Convert Spectrum-2 and later ASICs to perform sampling by mirroring to the CPU with probability. Subsequent patches will add support for egress / per-flow sampling and expose the extra metadata. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:22:39 -08:00
Ido Schimmel	34a277212c	mlxsw: spectrum_trap: Split sampling traps between ASICs Sampling of ingress packets is supported using a dedicated sampling mechanism on all Spectrum ASICs. However, Spectrum-2 and later ASICs support more sophisticated sampling by mirroring packets to the CPU. As a preparation for more advanced sampling configurations, split the trap configuration used for sampled packets between Spectrum-1 and later ASICs. This is needed since packets that are mirrored to the CPU are trapped via a different trap identifier compared to packets that are sampled using the dedicated sampling mechanism. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:22:39 -08:00
Ido Schimmel	20afb9bc48	mlxsw: spectrum_matchall: Split sampling support between ASICs Sampling of ingress packets is supported using a dedicated sampling mechanism on all Spectrum ASICs. However, Spectrum-2 and later ASICs support more sophisticated sampling by mirroring packets to the CPU. As a preparation for more advanced sampling configurations, split the sampling operations between Spectrum-1 and later ASICs. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:22:39 -08:00

1 2 3 4 5 ...

996698 Commits All Branches Search

996698 Commits

All Branches