linux

Commit Graph

Author	SHA1	Message	Date
Shay Agroskin	4cb4e98e5b	net/mlx5e: Added 'raw_errors_laneX' fields to ethtool statistics These are counters for errors received on rx side, such as FEC errors. Signed-off-by: Shay Agroskin <shayag@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-10-18 13:32:57 -07:00
Shay Agroskin	67daf11860	net/mlx5: Added "per_lane_error_counters" cap bit to PCAM Added "Per lane raw errors" capability bit in Ports Capabilities Mask (PCAM) enhanced features layout. This bit determines if the fields "phy_raw_errors_laneX" in "Physical Layer statistical" counters group are supported. Signed-off-by: Shay Agroskin <shayag@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-10-18 13:32:36 -07:00
Shay Agroskin	6cfa946050	net/mlx5e: Ethtool driver callback for query/set FEC policy Driver callback function for 'ethtool --show-fec', 'ethtool --set-fec' commands. The query function returns active and configured FEC policy for current link speed. The set function sets FEC policy for all supported link speeds. 1) If current link speed doesn't support requested FEC policy, the function fails. 2) If a different link speed doesn't support requested FEC policy, FEC capabilities for this speed are turned off. Signed-off-by: Shay Agroskin <shayag@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-10-18 13:13:31 -07:00
Shay Agroskin	2095b26414	net/mlx5e: Add port FEC get/set functions Added functions to query and set link FEC policy. To get/set FEC capabilities in PPLM reg we need to query current link speed. 'mlx5_get_fec_speed_field' queries current link speed and returns correct field offset. FEC Query's return value is divided into 'active FEC policy', which is the FEC policy used by the link, and 'configured FEC policy', which is the FEC policy requested by the user. The two values may differ if: 1) FEC policy was configured to 'auto', in which case the active FEC policy would be the default FEC policy for current link speed. 2) FEC policy was changed, but no link reset was performed, in which case the active FEC policy would become the configured one after a link reset. FEC set function sets FEC policy for all link speeds and performs a link reset. 1) If current link speed doesn't support requested FEC policy, the function fails. 2) If a different link speed doesn't support requested FEC policy, FEC capabilities for this speed are turned off and a warning message is printed. Signed-off-by: Shay Agroskin <shayag@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-10-18 13:13:31 -07:00
Shay Agroskin	4b5b9c7d97	net/mlx5: Add FEC fields to Port Phy Link Mode (PPLM) reg Added FEC related fields to PPLM layout. These fields are needed to set and query FEC policy for different link speeds. Signed-off-by: Shay Agroskin <shayag@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-10-18 13:13:31 -07:00
Vlad Buslov	2a4c429802	net/mlx5: Remove counter from idr after removing it from list Fs_counters list can temporarily become unsorted when new counters are created/deleted concurrently. Idr is used to quickly lookup position to insert new counter in logarithmic time. However, if new flows are concurrently inserted during time window when flows with adjacent ids are already removed from idr but are still present in counters list, mlx5_fc_stats_work() observes counters list in inconsistent state, which results in the following warning: [ 1839.561955] mlx5_core 0000:81:00.0: mlx5_cmd_fc_bulk_get:587:(pid 729): Flow counter id (0x102d5) out of range (0x1c0a8..0x1c10b). Counter ignored. Move idr_remove() call to be executed synchronously with counter deletion from list. Extract this code to mlx5_fc_stats_remove() helper function that is called by workqueue job handler mlx5_fc_stats_work(). Fixes: `12d6066c3b` ("net/mlx5: Add flow counters idr") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com>	2018-10-18 13:13:31 -07:00
Vlad Buslov	fd33071303	net/mlx5: Take fs_counters dellist before addlist In fs_counters elements from both addlist and dellist are removed by mlx5_fc_stats_work() without any locking. This introduces a race condition when a batch of new rules is created and then immediately deleted (for example, when an error occurred during flow creation). In such case some of the rules might be in dellist, but not in addlist when mlx5_fc_stats_work() is executed concurrently with tc, which will result in rule deletion and use-after-free on next iteration because deleted rules are still in addlist. Always take dellist first to guarantee that rules can only be deleted after they were removed from addlist. Fixes: `6e5e228391` ("net/mlx5: Add new list to store deleted flow counters") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reported-by: Chris Mi <chrism@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com>	2018-10-18 13:13:31 -07:00
Tariq Toukan	4972e6fa3a	net/mlx5: Refactor fragmented buffer struct fields and init flow Take struct mlx5_frag_buf out of mlx5_frag_buf_ctrl, as it is not needed to manage and control the datapath of the fragmented buffers API. struct mlx5_frag_buf contains control info to manage the allocation and de-allocation of the fragmented buffer. Its fields are not relevant for datapath, so here I take them out of the struct mlx5_frag_buf_ctrl, except for the fragments array itself. In addition, modified mlx5_fill_fbc to initialise the frags pointers as well. This implies that the buffer must be allocated before the function is called. A set of type-specific *_get_byte_size() functions are replaced by a generic one. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-10-18 13:13:31 -07:00
David S. Miller	3a3295bfa6	Merge branch 'sctp-fix-sk_wmem_queued-and-use-it-to-check-for-writable-space' Xin Long says: ==================== sctp: fix sk_wmem_queued and use it to check for writable space sctp doesn't count and use asoc sndbuf_used, sk sk_wmem_alloc and sk_wmem_queued properly, which also causes some problem. This patchset is to improve it. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-18 11:23:47 -07:00
Xin Long	cd305c74b0	sctp: use sk_wmem_queued to check for writable space sk->sk_wmem_queued is used to count the size of chunks in the out queue while sk->sk_wmem_alloc is for counting the size of chunks that have been sent. sctp is increasing both of them before enqueuing the chunks, and using sk->sk_wmem_alloc to check for writable space. However, sk_wmem_alloc is also increased by 1 for the skb allocated for sending in sctp_packet_transmit() but it will not wake up the waiters when sk_wmem_alloc is decreased in this skb's destructor. If msg size is equal to sk_sndbuf and sendmsg is waiting for sndbuf, the check 'msg_len <= sctp_wspace(asoc)' in sctp_wait_for_sndbuf() will keep waiting if there's a skb allocated in sctp_packet_transmit, and later even if this skb got freed, the waiting thread will never be woken up. This issue has been there since the very beginning, so we change to use sk->sk_wmem_queued to check for writable space as sk_wmem_queued is not increased for the skb allocated for sending, as TCP also does. SOCK_SNDBUF_LOCK check is also removed here as it's for tx buf auto tuning which I will add in another patch. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-18 11:23:47 -07:00
Xin Long	605c0ac182	sctp: count both sk and asoc sndbuf with skb truesize and sctp_chunk size Now it's confusing that asoc sndbuf_used is doing memory accounting with SCTP_DATA_SNDSIZE(chunk) + sizeof(sk_buff) + sizeof(sctp_chunk) while sk sk_wmem_alloc is doing that with skb->truesize + sizeof(sctp_chunk). It also causes sctp_prsctp_prune to count with a wrong freed memory when sndbuf_policy is not set. To make this right and also keep consistent between asoc sndbuf_used, sk sk_wmem_alloc and sk_wmem_queued, use skb->truesize + sizeof(sctp_chunk) for them. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-18 11:23:47 -07:00
David S. Miller	2d0f0ca2c7	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 1GbE Intel Wired LAN Driver Updates 2018-10-17 This series adds support for the new igc driver. The igc driver is the new client driver supporting the Intel I225 Ethernet Controller, which supports 2.5GbE speeds. The reason for creating a new client driver, instead of adding support for the new device in e1000e, is that the silicon behaves more like devices supported in igb driver. It also did not make sense to add a client part, to the igb driver which supports only 1GbE server parts. This initial set of patches is designed for basic support (i.e. link and pass traffic). Follow-on patch series will add more advanced support like VLAN, Wake-on-LAN, etc.. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-18 10:27:20 -07:00
David S. Miller	99e9acd85c	Merge tag 'mlx5-updates-2018-10-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-updates-2018-10-17 ======================================================================== From Or Gerlitz <ogerlitz@mellanox.com>: This series from Paul adds support to mlx5 e-switch tc offloading of multiple priorities and chains. This is made of four building blocks (along with a few minor driver refactors): [1] Split FDB fast path prio to multiple namespaces Currently the FDB name-space contains two priorities, fast path (p0) and slow path (p1). The slow path contains the per representor SQ send-to-vport TX rule and the match-all RX miss rule. As a pre-step to support multi-chains and priorities, we split the FDB fast path to multiple namespaces (sub namespaces), each with multiple priorities. [2] E-Switch chains and priorities A chain is a group of priorities. We use the fdb parallel sub-namespaces to implement chains, and a flow table for each priority in them. Because these namespaces are parallel and in series to the slow path fdb, the chains aren't connected to each other (but to the slow path), and one must use an explicit goto action to reach a different chain. Flow tables for the priorities are created on demand and destroyed once not used. [3] Add a no-append flow insertion mode, use it for TC offloads Enhance the driver fs core, such that if a no-append flag is set by the caller, we add a new FTE, instead of appending the actions of the inserted rule when the same match already exists. For encap rules, we defer the HW offloading till we have a valid neighbor. This can result in the packet hitting a lower priority rule in the HW DP. Use the no-append API to push these packets to the slow path FDB table, so they go to the TC kernel DP as done before priorities were supported. [4] Offloading tc priorities and chains for eswitch flows Using [1], [2] and [3] above we add the support for offloading both chains and priorities. To get to a new chain, use the tc goto action. We support a fixed prio range 1-16, and chains 0-3. ============================================================================= Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-18 10:25:37 -07:00
David S. Miller	8f18da4721	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2018-10-18 1) Remove an unnecessary dev->tstats check in xfrmi_get_stats64. From Li RongQing. 2) We currently do a sizeof(element) instead of a sizeof(array) check when initializing the ovec array of the secpath. Currently this array can have only one element, so code is OK but error-prone. Change this to do a sizeof(array) check so that we can add more elements in future. From Li RongQing. 3) Improve xfrm IPv6 address hashing by using the complete IPv6 addresses for a hash. From Michal Kubecek. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-18 09:57:42 -07:00
Gustavo A. R. Silva	82385b0d2d	net: skbuff.h: Mark expected switch fall-throughs In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 22:31:30 -07:00
Arthur Kiyanovski	9fd255928d	net: ena: enable Low Latency Queues Use the new API to enable usage of LLQ. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 22:30:41 -07:00
Netanel Belgazal	8c590f9776	net: ena: Fix Kconfig dependency on X86 The Kconfig limitation of X86 is too wide. The ENA driver only requires a little endian dependency. Change the dependency to be on little endian CPU. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 22:28:34 -07:00
David S. Miller	a58598a497	Merge branch 'tcp_bbr-TCP-BBR-changes-for-EDT-pacing-model' Neal Cardwell says: ==================== tcp_bbr: TCP BBR changes for EDT pacing model Two small patches for TCP BBR to follow up with Eric's recent work to change the TCP and fq pacing machinery to an "earliest departure time" (EDT) model: - The first patch adjusts the TCP BBR logic to work with the new "earliest departure time" (EDT) pacing model. - The second patch adjusts the TCP BBR logic to centralize the setting of gain values, to simplify the code and prepare for future changes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 22:22:54 -07:00
Neal Cardwell	cf33e25c0d	tcp_bbr: centralize code to set gains Centralize the code that sets gains used for computing cwnd and pacing rate. This simplifies the code and makes it easier to change the state machine or (in the future) dynamically change the gain values and ensure that the correct gain values are always used. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 22:22:53 -07:00
Neal Cardwell	a87c83d5ee	tcp_bbr: adjust TCP BBR for departure time pacing Adjust TCP BBR for the new departure time pacing model in the recent commit `ab408b6dc7` ("tcp: switch tcp and sch_fq to new earliest departure time model"). With TSQ and pacing at lower layers, there are often several skbs queued in the pacing layer, and thus there is less data "in the network" than "in flight". With departure time pacing at lower layers (e.g. fq or potential future NICs), the data in the pacing layer now has a pre-scheduled ("baked-in") departure time that cannot be changed, even if the congestion control algorithm decides to use a new pacing rate. This means that there can be a non-trivial lag between when BBR makes a pacing rate change and when the inter-skb pacing delays change. After a pacing rate change, the number of packets in the network can gradually evolve to be higher or lower, depending on whether the sending rate is higher or lower than the delivery rate. Thus ignoring this lag can cause significant overshoot, with the flow ending up with too many or too few packets in the network. This commit changes BBR to adapt its pacing rate based on the amount of data in the network that it estimates has already been "baked in" by previous departure time decisions. We estimate the number of our packets that will be in the network at the earliest departure time (EDT) for the next skb scheduled as: in_network_at_edt = inflight_at_edt - (EDT - now) * bw If we're increasing the amount of data in the network ("in_network"), then we want to know if the transmit of the EDT skb will push in_network above the target, so our answer includes bbr_tso_segs_goal() from the skb departing at EDT. If we're decreasing in_network, then we want to know if in_network will sink too low just before the EDT transmit, so our answer does not include the segments from the skb departing at EDT. Why do we treat pacing_gain > 1.0 case and pacing_gain < 1.0 case differently? 
The in_network curve is a step function: in_network goes up on transmits, and down on ACKs. To accurately predict when in_network will go beyond our target value, this will happen on different events, depending on whether we're concerned about in_network potentially going too high or too low: o if pushing in_network up (pacing_gain > 1.0), then in_network goes above target upon a transmit event o if pushing in_network down (pacing_gain < 1.0), then in_network goes below target upon an ACK event This commit changes the BBR state machine to use this estimated "packets in network" value to make its decisions. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 22:22:53 -07:00
Vijay Khemka	cb10c7c0df	net/ncsi: Add NCSI Broadcom OEM command This patch adds OEM Broadcom commands and response handling. It also defines OEM Get MAC Address handler to get and configure the device. ncsi_oem_gma_handler_bcm: This handler sends the NCSI Broadcom command for getting the MAC address. ncsi_rsp_handler_oem_bcm: This handles the response received for all Broadcom OEM commands. ncsi_rsp_handler_oem_bcm_gma: This handles the get MAC address response and sets the address on the device. Signed-off-by: Vijay Khemka <vijaykhemka@fb.com> Reviewed-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 22:14:54 -07:00
David S. Miller	1010c17ec5	Merge branch 'mscc-fixes' Gustavo A. R. Silva says: ==================== fix signedness bug and memory leak in mscc driver This patchset aims to fix a signedness bug in function vsc85xx_downshift_get() and a memory leak in function vsc8574_config_pre_init(). Changes in v3: - Add Quentin's Reviewed-by to commit log in patch 2/2. - Post the series to netdev. Changes in v2: - Add Quentin's Reviewed-by to commit log in patch 1/2. - Jump to out label so all functions in the driver exit with the PHY set to access the standard page. Thanks to Quentin Schulz for pointing this out. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 22:08:56 -07:00
Gustavo A. R. Silva	47d20212aa	net: phy: mscc: fix memory leak in vsc8574_config_pre_init In case memory resources for fw were successfully allocated, release them before return. Addresses-Coverity-ID: 1473968 ("Resource leak") Fixes: `00d70d8e0e` ("net: phy: mscc: add support for VSC8574 PHY") Reviewed-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 22:08:55 -07:00
Gustavo A. R. Silva	e519869af3	net: phy: mscc: fix signedness bug in vsc85xx_downshift_get Currently, the error handling for the call to function phy_read_paged() doesn't work because reg_val is of type u16 (16 bits, unsigned), which makes it impossible for it to hold a value less than 0. Fix this by changing the type of variable reg_val to int. Addresses-Coverity-ID: 1473970 ("Unsigned compared against 0") Fixes: `6a0bfbbe20` ("net: phy: mscc: migrate to phy_select/restore_page functions") Reviewed-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 22:08:55 -07:00
Kyeongdon Kim	33c4368ee2	net: fix warning in af_unix This fixes the "'hash' may be used uninitialized in this function" net/unix/af_unix.c:1041:20: warning: 'hash' may be used uninitialized in this function [-Wmaybe-uninitialized] addr->hash = hash ^ sk->sk_type; Signed-off-by: Kyeongdon Kim <kyeongdon.kim@lge.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:57:28 -07:00
Marek Behún	26422340da	net: dsa: mv88e6xxx: Fix 88E6141/6341 2500mbps SERDES speed This is a fix for the port_set_speed method for the Topaz family. Currently the same method is used as for the Peridot family, but this is wrong for the SERDES port. On Topaz, the SERDES port is port 5, not 9 and 10 as in Peridot. Moreover setting alt_bit on Topaz only makes sense for port 0 (for differentiating 100mbps vs 200mbps). The SERDES port does not support more than 2500mbps, so alt_bit does not make any difference. Signed-off-by: Marek Behún <marek.behun@nic.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:56:15 -07:00
David S. Miller	e943d94e4b	Merge branch 'octeontx2-af-NPA-and-NIX-blocks-initialization' Sunil Goutham says: ==================== octeontx2-af: NPA and NIX blocks initialization This patchset is a continuation to earlier submitted patch series to add a new driver for Marvell's OcteonTX2 SOC's Resource virtualization unit (RVU) admin function driver. octeontx2-af: Add RVU Admin Function driver https://www.spinics.net/lists/netdev/msg528272.html This patch series adds logic for the following. - Modified register polling loop to use time_before(jiffies, timeout), as suggested by Arnd Bergmann. - Support to forward interface link status notifications sent by firmware to registered PFs mapped to a CGX::LMAC. - Support to set CGX LMAC in loopback mode, retrieve stats, configure DMAC filters at CGX level etc. - Network pool allocator (NPA) functional block initialization, admin queue support, NPALF aura/pool contexts memory allocation, init and deinit. - Network interface controller (NIX) functional block basic init, admin queue support, NIXLF RQ/CQ/SQ HW contexts memory allocation, init and deinit. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:43 -07:00
Geetha sowjanya	557dd485ea	octeontx2-af: Support for disabling NIX RQ/SQ/CQ contexts This patch adds support for a RVU PF/VF to disable all RQ/SQ/CQ contexts of a NIX LF via mbox. This will be used by PF/VF drivers upon teardown or while freeing up HW resources. A HW context which is not INIT'ed cannot be modified and a RVU PF/VF driver may or may not INIT all the RQ/SQ/CQ contexts. So a bitmap is introduced to keep track of enabled NIX RQ/SQ/CQ contexts, so that only enabled hw contexts are disabled upon LF teardown. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Stanislaw Kardach <skardach@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:43 -07:00
Sunil Goutham	ffb0abd7e9	octeontx2-af: NIX AQ instruction enqueue support Add support for a RVU PF/VF to submit instructions to NIX AQ via mbox. Instructions can be to init/write/read RQ/SQ/CQ/RSS contexts. In case of read, context will be returned as part of response to the mbox msg received. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:43 -07:00
Sunil Goutham	709a4f0c25	octeontx2-af: Alloc bitmaps for NIX Tx scheduler queues Allocate bitmaps and memory for PFVF mapping info for maintaining NIX transmit scheduler queues. PF/VF drivers will request alloc, free, etc. of Tx schedulers via mailbox. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:43 -07:00
Sunil Goutham	59360e9809	octeontx2-af: NIX LSO config for TSOv4/v6 offload Config LSO formats for TSOv4 and TSOv6 offloads. These formats tell HW which fields in the TCP packet's headers have to be updated while performing segmentation offload. Also report PF/VF drivers the LSO format indices as part of response to NIX_LF_ALLOC mbox msg. These indices are used in SQE extension headers while framing SQE for pkt transmission with TSO offload. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:43 -07:00
Sunil Goutham	cb30711a6c	octeontx2-af: NIX block LF initialization Upon receiving NIX_LF_ALLOC mbox message allocate memory for NIXLF's CQ, SQ, RQ, CINT, QINT and RSS HW contexts and configure respective base iova HW. Enable caching of contexts into NIX NDC. Return SQ buffer (SQB) size, this PF/VF MAC address, etc. to the mbox msg sender. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:43 -07:00
Sunil Goutham	aba53d5dbc	octeontx2-af: NIX block admin queue init Initialize NIX admin queue (AQ) i.e alloc memory for AQ instructions and for the results. All NIX LFs will submit instructions to AQ to init/write/read RQ/SQ/CQ/RSS contexts and in case of read, get context from result memory. Also before configuring/using NIX block calibrate X2P bus and check if NIX interfaces like CGX and LBK are in active and working state. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:43 -07:00
Geetha sowjanya	57856dde11	octeontx2-af: Support for disabling NPA Aura/Pool contexts This patch adds support for a RVU PF/VF to disable all Aura/Pool contexts of a NPA LF via mbox. This will be used by PF/VF drivers upon teardown or while freeing up HW resources. A HW context which is not INIT'ed cannot be modified and a RVU PF/VF driver may or may not INIT all the Aura/Pool contexts. So a bitmap is introduced to keep track of enabled NPA Aura/Pool contexts, so that only enabled hw contexts are disabled upon LF teardown. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Stanislaw Kardach <skardach@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:43 -07:00
Sunil Goutham	4a3581cd59	octeontx2-af: NPA AQ instruction enqueue support Add support for a RVU PF/VF to submit instructions to NPA AQ via mbox. Instructions can be to init/write/read Aura/Pool/Qint contexts. In case of read, context will be returned as part of response to the mbox msg received. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:42 -07:00
Sunil Goutham	3fa4c3232a	octeontx2-af: NPA block LF initialization Upon receiving NPA_LF_ALLOC mbox message allocate memory for NPALF's aura, pool and qint contexts and configure the same to HW. Enable caching of contexts into NPA NDC. Return pool related info like stack size, num pointers per stack page, etc. to the mbox msg sender. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:42 -07:00
Sunil Goutham	7a37245ef2	octeontx2-af: NPA block admin queue init Initialize NPA admin queue (AQ) i.e alloc memory for AQ instructions and for the results. All NPA LFs will submit instructions to AQ to init/write/read Aura/Pool contexts and in case of read, get context from result memory. Added some common APIs for allocating memory for a queue and get IOVA in return, these APIs will be used by NIX AQ and for other purposes. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:42 -07:00
Geetha sowjanya	23999b30ae	octeontx2-af: Enable or disable CGX internal loopback Add support to enable or disable internal loopback mode in CGX. New mbox IDs CGX_INTLBK_ENABLE/DISABLE added for this. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:42 -07:00
Linu Cherian	61071a871e	octeontx2-af: Forward CGX link notifications to PFs Upon receiving notification from firmware the CGX event handler in the AF driver gets the current link info such as status, speed, duplex etc from CGX driver and sends it across to PFs who have registered to receive such notifications. To support above - Mbox messaging support for sending msgs from AF to PF has been added. - Added mbox msgs so that PFs can register/unregister for link events. - Link notifications are sent to PF under two scenarios. 1. When an asynchronous link change notification is received from firmware with notification flag turned on for that PF. 2. Upon notification turn on request, the current link status is sent to the PF. Also added a new mailbox msg using which RVU PF/VF can retrieve their mapped CGX LMAC's current link info. Link info includes status, speed, duplex and lmac type. Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:42 -07:00
Vidhya Raman	96be2e0da8	octeontx2-af: Support for MAC address filters in CGX This patch adds support for setting MAC address filters in CGX for PF interfaces. Also PF interfaces can be put in promiscuous mode. Dataplane PFs access this functionality using mailbox messages to the AF driver. Signed-off-by: Vidhya Raman <vraman@marvell.com> Signed-off-by: Stanislaw Kardach <skardach@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:42 -07:00
Christina Jacob	66208910e5	octeontx2-af: Support to retrieve CGX LMAC stats This patch adds support for a RVU PF/VF driver to retrieve its mapped CGX LMAC Rx and Tx stats from AF via mbox. A new mailbox msg is added. Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:42 -07:00
Sunil Goutham	1435f66a28	octeontx2-af: CGX Rx/Tx enable/disable mbox handlers Added new mailbox msgs for RVU PF/VFs to request AF to enable/disable their mapped CGX::LMAC Rx & Tx. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:42 -07:00
Sunil Goutham	6ca3ee2f7d	octeontx2-af: Improve register polling loop Instead of looping on an integer timeout, use time_before(jiffies), so that maximum poll time is capped. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 21:33:42 -07:00
David S. Miller	53e50a6ec2	Merge branch 'mlxsw-Add-VxLAN-support' Ido Schimmel says: ==================== mlxsw: Add VxLAN support This patchset adds support for VxLAN offload in the mlxsw driver. With regards to the forwarding plane, VxLAN support is composed from two main parts: Encapsulation and decapsulation. In the device, NVE encapsulation (and VxLAN in particular) takes place in the bridge. A packet can be encapsulated using VxLAN either because it hit an FDB entry that forwards it to the router with the IP of the remote VTEP or because it was flooded, in which case it is sent to a list of remote VTEPs (in addition to local ports). In either case, the VNI is derived from the filtering identifier (FID) the packet was classified to at ingress and the underlay source IP is taken from a device global configuration. VxLAN decapsulation takes place in the underlay router, where packets that hit a local route that corresponds to the source IP of the local VTEP are decapsulated and injected to the bridge. The packets are classified to a FID based on the VNI they came with. The first six patches export the required APIs in the VxLAN and mlxsw drivers in order to allow for the introduction of the NVE core in the next two patches. The NVE core is designed to support a variety of NVE encapsulations (e.g., VxLAN, NVGRE) and different ASICs, but currently only VxLAN and Spectrum are supported. Spectrum-2 support will be added in the future. The last 10 patches add support for VxLAN decapsulation and encapsulation and include the addition of the required switchdev APIs in the VxLAN driver. These APIs allow capable drivers to get a notification about the addition / deletion of FDB entries to / from the VxLAN's FDB. Subsequent patchset will add selftests (generic and mlxsw-specific), data plane learning, FDB extack and vetoing and support for VLAN-aware bridges (one VNI per VxLAN device model). 
v2: * Implement netif_is_vxlan() using rtnl_link_ops->kind (Jakub & Stephen) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 17:45:08 -07:00
Ido Schimmel	1231e04f5b	mlxsw: spectrum_switchdev: Add support for VxLAN encapsulation In the device, VxLAN encapsulation takes place in the FDB table where certain {MAC, FID} entries are programmed with an underlay unicast IP. MAC addresses that are not programmed in the FDB are flooded to the relevant local ports and also to a list of underlay unicast IPs that are programmed using the all zeros MAC address in the VxLAN driver. One difference between the hardware and software data paths is the fact that in the software data path there are two FDB lookups prior to the encapsulation of the packet. First in the bridge's FDB table using {MAC, VID} and another in the VxLAN's FDB table using {MAC, VNI}. Therefore, when a new VxLAN FDB entry is notified, it is only programmed to the device if there is a corresponding entry in the bridge's FDB table. Similarly, when a new bridge FDB entry pointing to the VxLAN device is notified, it is only programmed to the device if there is a corresponding entry in the VxLAN's FDB table. Note that the above scheme will result in a discrepancy between both data paths if only one FDB table is populated in the software data path. For example, if only the bridge's FDB is populated with an entry pointing to a VxLAN device, then a packet hitting the entry will only be flooded by the kernel to remote VTEPs whereas the device will also flood the packets to other local ports that are members of the VLAN. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 17:45:08 -07:00
Ido Schimmel	1c30d1836a	mlxsw: spectrum: Enable VxLAN enslavement to bridges Enslavement of VxLAN devices to offloaded bridges was never forbidden by mlxsw, but this patch makes sure the required configuration is performed in order to allow VxLAN encapsulation and decapsulation to take place in the device. The patch handles both the case where a VxLAN device is enslaved to an already offloaded bridge and the case where the first mlxsw port is enslaved to a bridge that already has a VxLAN device configured. Invalid configurations are sanitized and an error string is returned via extack. Since encapsulation and decapsulation do not occur when the VxLAN device is down, the driver makes sure to enable / disable these functionalities based on NETDEV_PRE_UP and NETDEV_DOWN events. Note that NETDEV_PRE_UP is used in favor of NETDEV_UP, as the former allows vetoing the operation, if necessary. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 17:45:08 -07:00
Ido Schimmel	e9ba0fbc7d	bridge: switchdev: Allow clearing FDB entry offload indication Currently, an FDB entry only ceases being offloaded when it is deleted. This changes with VxLAN encapsulation. Devices capable of performing VxLAN encapsulation usually have only one FDB table, unlike the software data path which has two - one in the bridge driver and another in the VxLAN driver. Therefore, bridge FDB entries pointing to a VxLAN device are only offloaded if there is a corresponding entry in the VxLAN FDB. Allow clearing the offload indication in case the corresponding entry was deleted from the VxLAN FDB. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 17:45:08 -07:00
Petr Machata	045a5a9914	vxlan: Notify for each remote of a removed FDB entry When notifications are sent about FDB activity, and an FDB entry with several remotes is removed, the notification is sent only for the first destination. That makes it impossible to distinguish between the case where only this first remote is removed, and the one where the FDB entry is removed as a whole. Therefore send one notification for each remote of a removed FDB entry. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 17:45:08 -07:00
Petr Machata	0efe117333	vxlan: Support marking RDSTs as offloaded Offloaded bridge FDB entries are marked with NTF_OFFLOADED. Implement a similar mechanism for VXLAN, where a given remote destination can be marked as offloaded. To that end, introduce a new event, SWITCHDEV_VXLAN_FDB_OFFLOADED, through which the marking is communicated to the vxlan driver. To identify which RDST should be marked as offloaded, a switchdev_notifier_vxlan_fdb_info is passed to the listeners. The "offloaded" flag in that object determines whether the offloaded mark should be set or cleared. When sending offloaded FDB entries over netlink, mark them with NTF_OFFLOADED. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 17:45:08 -07:00
Petr Machata	1941f1d645	vxlan: Add vxlan_fdb_find_uc() for FDB querying A switchdev-capable driver that is aware of VXLAN may need to query VXLAN FDB. In the particular case of mlxsw, this functionality is limited to querying UC FDBs. Those being easier to deal with than the general case of RDST chain traversal, introduce an interface to query specifically UC FDBs: vxlan_fdb_find_uc(). Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-17 17:45:08 -07:00
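
Note on commit 6ca3ee2f7d (octeontx2-af: Improve register polling loop): the message describes replacing an iteration-count timeout with a time-based one, so the maximum poll time is capped regardless of how fast each iteration runs. The following is a minimal userspace sketch of that idea, not the driver's code. It assumes a POSIX monotonic clock as a stand-in for jiffies, and the names now_ns, fake_reg, fake_reg_ready and poll_until are hypothetical. Unlike the kernel's time_before(), the plain comparison here does not handle counter wrap-around, which is not a practical concern for a 64-bit nanosecond counter.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds; userspace stand-in for jiffies. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical "register": reports ready on its ready_after-th read. */
struct fake_reg {
    unsigned reads;
    unsigned ready_after;
};

static bool fake_reg_ready(void *ctx)
{
    struct fake_reg *r = ctx;

    return ++r->reads >= r->ready_after;
}

/*
 * Poll cond(ctx) until it returns true or timeout_ns has elapsed.
 * The bound is wall-clock time, not an iteration count, so the
 * worst-case wait does not depend on how long each check takes.
 * Returns 0 on success, -1 on timeout.
 */
static int poll_until(bool (*cond)(void *), void *ctx, uint64_t timeout_ns)
{
    uint64_t deadline;

    assert(cond != NULL);
    deadline = now_ns() + timeout_ns;

    while (now_ns() < deadline) {
        if (cond(ctx))
            return 0;
    }

    /* Final check in case the condition became true at the deadline. */
    return cond(ctx) ? 0 : -1;
}
```

A caller would pass a predicate that reads the status bit of interest, for example poll_until(fake_reg_ready, &reg, 1000000ull) for a 1 ms cap. With an iteration-count loop the same cap would vary with CPU speed and register access latency.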


785686 Commits All Branches Search
