linux

Commit Graph

Author	SHA1	Message	Date
Sudarsana Reddy Kalluru	07f12622a6	bnx2x: Enable PTP only on the PF that initializes the port There will be only one PHC clock per port. PTP should be enabled only on one PF per port. The change enables PTP functionality on the PF that initializes the port. The change is useful in multi-function modes e.g., NPAR where a port can have more than one PF. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 16:25:14 -08:00
Eric Dumazet	55469bc6b5	drivers: net: remove <net/busy_poll.h> inclusion when not needed Drivers using generic NAPI interface no longer need to include <net/busy_poll.h>, since busy polling was moved to core networking stack long ago. See commit `79e7fff47b` ("net: remove support for per driver ndo_busy_poll()") for reference. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-25 16:20:02 -07:00
Alexander Duyck	8ec56fc3c5	net: allow fallback function to pass netdev For most of these calls we can just pass NULL through to the fallback function as the sb_dev. The only cases where we cannot are the cases where we might be dealing with either an upper device or a driver that would have configured things to support an sb_dev itself. The only driver that has any significant change in this patch set should be ixgbe as we can drop the redundant functionality that existed in both the ndo_select_queue function and the fallback function that was passed through to us. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-07-09 13:57:25 -07:00
Alexander Duyck	4f49dec907	net: allow ndo_select_queue to pass netdev This patch makes it so that instead of passing a void pointer as the accel_priv we instead pass a net_device pointer as sb_dev. Making this change allows us to pass the subordinate device through to the fallback function eventually so that we can keep the actual code in the ndo_select_queue call as focused on possible on the exception cases. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-07-09 13:41:34 -07:00
Sudarsana Reddy Kalluru	484c016d93	bnx2x: Fix receiving tx-timeout in error or recovery state. Driver performs the internal reload when it receives tx-timeout event from the OS. Internal reload might fail in some scenarios e.g., fatal HW issues. In such cases OS still see the link, which would result in undesirable functionalities such as re-generation of tx-timeouts. The patch addresses this issue by indicating the link-down to OS when tx-timeout is detected, and keeping the link in down state till the internal reload is successful. Please consider applying it to 'net' branch. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-30 18:48:02 +09:00
Sudarsana Reddy Kalluru	1b40428c04	bnx2x: Collect the device debug information during Tx timeout. Tx-timeout mostly happens due to some issue in the device. In such cases, debug dump would be helpful for identifying the cause of the issue. This patch adds support to spill debug data during the Tx timeout. Here bnx2x_panic_dump() API is used instead of bnx2x_panic(), since we still want to allow the Tx-timeout recovery a chance to succeed. Changes from previous version: ------------------------------- v2: Fixed a coding error. Please consider applying this to "net-next". Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-28 22:52:52 -04:00
Sinan Kaya	7f883c774e	bnx2x: Eliminate duplicate barriers on weakly-ordered archs Code includes wmb() followed by writel(). writel() already has a barrier on some architectures like arm64. This ends up CPU observing two barriers back to back before executing the register write. Since code already has an explicit barrier call, changing writel() to writel_relaxed(). Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-26 12:47:55 -04:00
Sinan Kaya	edd874235a	bnx2x: Replace doorbell barrier() with wmb() barrier() doesn't guarantee memory writes to be observed by the hardware on all architectures. barrier() only tells compiler not to move this code with respect to other read/writes. If memory write needs to be observed by the HW, wmb() is the right choice. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-26 12:47:55 -04:00
Gal Pressman	37ed41c423	bnx2x: Replace WARN_ONCE with netdev_WARN_ONCE Use the more appropriate netdev_WARN_ONCE instead of WARN_ONCE macro. Signed-off-by: Gal Pressman <galp@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Cc: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-08 20:53:14 -05:00
David S. Miller	6bb8824732	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net net/ipv6/ip6_gre.c is a case of parallel adds. include/trace/events/tcp.h is a little bit more tricky. The removal of in-trace-macro ifdefs in 'net' paralleled with moving show_tcp_state_name and friends over to include/trace/events/sock.h in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-29 15:42:26 -05:00
Guilherme G. Piccoli	f7084059a9	bnx2x: Improve reliability in case of nested PCI errors While in recovery process of PCI error (called EEH on PowerPC arch), another PCI transaction could be corrupted causing a situation of nested PCI errors. Also, this scenario could be reproduced with error injection mechanisms (for debug purposes). We observe that in case of nested PCI errors, bnx2x might attempt to initialize its shmem and cause a kernel crash due to bad addresses read from MCP. Multiple different stack traces were observed depending on the point the second PCI error happens. This patch avoids the crashes by: * failing PCI recovery in case of nested errors (since multiple PCI errors in a row are not expected to lead to a functional adapter anyway), and by, * preventing access to adapter FW when MCP is failed (we mark it as failed when shmem cannot get initialized properly). Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Acked-by: Shahed Shaikh <Shahed.Shaikh@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-27 12:13:32 -05:00
Michael Chan	3c3def5fc6	bnx2x: Use NETIF_F_GRO_HW. Advertise NETIF_F_GRO_HW and turn on TPA_MODE_GRO when NETIF_F_GRO_HW is set. Disable NETIF_F_GRO_HW in bnx2x_fix_features() if the MTU does not support TPA_MODE_GRO or GRO is not set. bnx2x_change_mtu() also needs to disable NETIF_F_GRO_HW if the MTU does not support it. Original parameter disable_tpa will continue to disable LRO and GRO_HW. Preserve the original behavior of enabling LRO by default. User has to run ethtool -K to explicitly enable GRO_HW. Cc: Ariel Elior <Ariel.Elior@cavium.com> Cc: everest-linux-l2@cavium.com Signed-off-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-19 10:38:37 -05:00
Nogah Frankel	575ed7d39e	net_sch: mqprio: Change TC_SETUP_MQPRIO to TC_SETUP_QDISC_MQPRIO Change TC_SETUP_MQPRIO to TC_SETUP_QDISC_MQPRIO to match the new convention. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-08 12:23:38 +09:00
Jiri Pirko	de4784ca03	net: sched: get rid of struct tc_to_netdev Get rid of struct tc_to_netdev which is now just unnecessary container and rather pass per-type structures down to drivers directly. Along with that, consolidate the naming of per-type structure variables in cls_*. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:37 -07:00
Jiri Pirko	38cf0426e5	net: sched: change return value of ndo_setup_tc for driver supporting mqprio only Change the return value from -EINVAL to -EOPNOTSUPP. The rest of the drivers have it like that, so be aligned. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:37 -07:00
Jiri Pirko	5fd9fc4e20	net: sched: push cls related args into cls_common structure As ndo_setup_tc is generic offload op for whole tc subsystem, does not really make sense to have cls-specific args. So move them under cls_common structurure which is embedded in all cls structs. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:37 -07:00
Jiri Pirko	2572ac53c4	net: sched: make type an argument for ndo_setup_tc Since the type is always present, push it to be a separate argument to ndo_setup_tc. On the way, name the type enum and use it for arg type. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:35 -07:00
David S. Miller	0ddead90b2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The conflicts were two cases of overlapping changes in batman-adv and the qed driver. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-15 11:59:32 -04:00
Mintz, Yuval	92f85f05ca	bnx2x: Allow vfs to disable txvlan offload VF clients are configured as enforced, meaning firmware is validating the correctness of their ethertype/vid during transmission. Once txvlan is disabled, VF would start getting SKBs for transmission here vlan is on the payload - but it'll pass the packet's ethertype instead of the vid, leading to firmware declaring it as malicious. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-10 16:02:54 -04:00
Jiri Pirko	a5fcf8a6c9	net: propagate tc filter chain index down the ndo_setup_tc call We need to push the chain index down to the drivers, so they have the information to which chain the rule belongs. For now, no driver supports multichain offload, so only chain 0 is supported. This is needed to prevent chain squashes during offload for now. Later this will be used to implement multichain offload. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-08 09:55:53 -04:00
Mintz, Yuval	3968d38917	bnx2x: Fix Multi-Cos Apparently multi-cos isn't working for bnx2x quite some time - driver implements ndo_select_queue() to allow queue-selection for FCoE, but the regular L2 flow would cause it to modulo the fallback's result by the number of queues. The fallback would return a queue matching the needed tc [via __skb_tx_hash()], but since the modulo is by the number of TSS queues where number of TCs is not accounted, transmission would always be done by a queue configured into using TC0. Fixes: `ada7c19e6d` ("bnx2x: use XPS if possible for bnx2x_select_queue instead of pure hash") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-01 12:22:02 -04:00
Scott Wood	9b70de6d02	bnx2x: Align RX buffers The bnx2x driver is not providing proper alignment on the receive buffers it passes to build_skb(), causing skb_shared_info to be misaligned. skb_shared_info contains an atomic, and while PPC normally supports unaligned accesses, it does not support unaligned atomics. Aligning the size of rx buffers will ensure that page_frag_alloc() returns aligned addresses. This can be reproduced on PPC by setting the network MTU to 1450 (or other non-multiple-of-4) and then generating sufficient inbound network traffic (one or two large "wget"s usually does it), producing the following oops: Unable to handle kernel paging request for unaligned access at address 0xc00000ffc43af656 Faulting instruction address: 0xc00000000080ef8c Oops: Kernel access of bad area, sig: 7 [#1] SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: vmx_crypto powernv_rng rng_core powernv_op_panel leds_powernv led_class nfsd ip_tables x_tables autofs4 xfs lpfc bnx2x mdio libcrc32c crc_t10dif crct10dif_generic crct10dif_common CPU: 104 PID: 0 Comm: swapper/104 Not tainted 4.11.0-rc8-00088-g4c761da #2 task: c00000ffd4892400 task.stack: c00000ffd4920000 NIP: c00000000080ef8c LR: c00000000080eee8 CTR: c0000000001f8320 REGS: c00000ffffc33710 TRAP: 0600 Not tainted (4.11.0-rc8-00088-g4c761da) MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24082042 XER: 00000000 CFAR: c00000000080eea0 DAR: c00000ffc43af656 DSISR: 00000000 SOFTE: 1 GPR00: c000000000907f64 c00000ffffc33990 c000000000dd3b00 c00000ffcaf22100 GPR04: c00000ffcaf22e00 0000000000000000 0000000000000000 0000000000000000 GPR08: 0000000000b80008 c00000ffc43af636 c00000ffc43af656 0000000000000000 GPR12: c0000000001f6f00 c00000000fe1a000 000000000000049f 000000000000c51f GPR16: 00000000ffffef33 0000000000000000 0000000000008a43 0000000000000001 GPR20: c00000ffc58a90c0 0000000000000000 000000000000dd86 0000000000000000 GPR24: c000007fd0ed10c0 00000000ffffffff 0000000000000158 000000000000014a GPR28: c00000ffc43af010 c00000ffc9144000 c00000ffcaf22e00 c00000ffcaf22100 NIP [c00000000080ef8c] __skb_clone+0xdc/0x140 LR [c00000000080eee8] __skb_clone+0x38/0x140 Call Trace: [c00000ffffc33990] [c00000000080fb74] skb_clone+0x74/0x110 (unreliable) [c00000ffffc339c0] [c000000000907f64] packet_rcv+0x144/0x510 [c00000ffffc33a40] [c000000000827b64] __netif_receive_skb_core+0x5b4/0xd80 [c00000ffffc33b00] [c00000000082b2bc] netif_receive_skb_internal+0x2c/0xc0 [c00000ffffc33b40] [c00000000082c49c] napi_gro_receive+0x11c/0x260 [c00000ffffc33b80] [d000000066483d68] bnx2x_poll+0xcf8/0x17b0 [bnx2x] [c00000ffffc33d00] [c00000000082babc] net_rx_action+0x31c/0x480 [c00000ffffc33e10] [c0000000000d5a44] __do_softirq+0x164/0x3d0 [c00000ffffc33f00] [c0000000000d60a8] irq_exit+0x108/0x120 [c00000ffffc33f20] [c000000000015b98] __do_irq+0x98/0x200 [c00000ffffc33f90] [c000000000027f14] call_do_irq+0x14/0x24 [c00000ffd4923a90] [c000000000015d94] do_IRQ+0x94/0x110 [c00000ffd4923ae0] [c000000000008d90] hardware_interrupt_common+0x150/0x160 Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 22:50:19 -04:00
Amritha Nambiar	56f36acd21	mqprio: Modify mqprio to pass user parameters via ndo_setup_tc. The configurable priority to traffic class mapping and the user specified queue ranges are used to configure the traffic class, overriding the hardware defaults when the 'hw' option is set to 0. However, when the 'hw' option is non-zero, the hardware QOS defaults are used. This patch makes it so that we can pass the data the user provided to ndo_setup_tc. This allows us to pull in the queue configuration if the user requested it as well as any additional hardware offload type requested by using a value other than 1 for the hw value. Finally it also provides a means for the device driver to return the level supported for the offload type via the qopt->hw value. Previously we were just always assuming the value to be 1, in the future values beyond just 1 may be supported. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-15 15:20:27 -07:00
Eric Dumazet	6ad20165d3	drivers: net: generalize napi_complete_done() napi_complete_done() allows to opt-in for gro_flush_timeout, added back in linux-3.19, commit `3b47d30396` ("net: gro: add a per device gro flush timer") This allows for more efficient GRO aggregation without sacrifying latencies. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-30 15:10:42 -05:00
Eric Dumazet	b9032741e4	bnx2x: avoid two atomic ops per page on x86 Commit `4cace675d6` ("bnx2x: Alloc 4k fragment for each rx ring buffer element") added extra put_page() and get_page() calls on arches where PAGE_SIZE=4K like x86 Reorder things to avoid this overhead. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Cc: Yuval Mintz <Yuval.Mintz@cavium.com> Cc: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-23 11:16:27 -05:00
Zhang Shengju	0e24c0ad2b	bnx2x: use reset to set network header Since offset is zero, it's not necessary to use set function. Reset function is straightforward, and will remove the unnecessary add operation in set function. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 15:49:16 -05:00
Eric Dumazet	80f1c21c53	bnx2x: switch to napi_complete_done() Switch from napi_complete() to napi_complete_done() for better GRO support (gro_flush_timeout) and core NAPI features. Do not rearm interrupts if we are busy polling, to reduce bus and interrupts overhead. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Adam Belay <abelay@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Yuval Mintz <Yuval.Mintz@cavium.com> Cc: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-16 13:40:58 -05:00
Jarod Wilson	e1c6dccaf3	ethernet/broadcom: use core min/max MTU checking tg3: min_mtu 60, max_mtu 9000/1500 bnxt: min_mtu 60, max_mtu 9000 bnx2x: min_mtu 46, max_mtu 9600 - Fix up ETH_OVREHEAD -> ETH_OVERHEAD while we're in here, remove duplicated defines from bnx2x_link.c. bnx2: min_mtu 46, max_mtu 9000 - Use more standard ETH_* defines while we're at it. bcm63xx_enet: min_mtu 46, max_mtu 2028 - compute_hw_mtu was made largely pointless, and thus merged back into bcm_enet_change_mtu. b44: min_mtu 60, max_mtu 1500 CC: netdev@vger.kernel.org CC: Michael Chan <michael.chan@broadcom.com> CC: Sony Chacko <sony.chacko@qlogic.com> CC: Ariel Elior <ariel.elior@qlogic.com> CC: Dept-HSGLinuxNICDev@qlogic.com CC: Siva Reddy Kallam <siva.kallam@broadcom.com> CC: Prashant Sreedharan <prashant@broadcom.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-18 11:34:18 -04:00
Yuval Mintz	d78a1f0845	bnx2x: don't wait for Tx completion on recovery When driver has hit a parity event, HW can no longer write to host memory. As a result, Tx completions cannot be written to the host SB memory, and waiting for Tx completions eventually timeout. As driver is willing to delay as much as 1-2 seconds per Tx queue for its draining and this delay is sequential, the time to recover might greatly lengthen needlessly in case the recovery is done under multi-connection traffic. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-16 19:23:35 -04:00
John Fastabend	5eb4dce3b3	net: relax setup_tc ndo op handle restriction I added this check in setup_tc to multiple drivers, if (handle != TC_H_ROOT \|\| tc->type != TC_SETUP_MQPRIO) Unfortunately restricting to TC_H_ROOT like this breaks the old instantiation of mqprio to setup a hardware qdisc. This patch relaxes the test to only check the type to make it equivalent to the check before I broke it. With this the old instantiation continues to work. A good smoke test is to setup mqprio with, # tc qdisc add dev eth4 root mqprio num_tc 8 \ map 0 1 2 3 4 5 6 7 \ queues 0@0 1@1 2@2 3@3 4@4 5@5 6@6 7@7 Fixes: `e4c6734eaa` ("net: rework ndo tc op to consume additional qdisc handle paramete") Reported-by: Singh Krishneil <krishneil.k.singh@intel.com> Reported-by: Jake Keller <jacob.e.keller@intel.com> CC: Murali Karicheri <m-karicheri2@ti.com> CC: Shradha Shah <sshah@solarflare.com> CC: Or Gerlitz <ogerlitz@mellanox.com> CC: Ariel Elior <ariel.elior@qlogic.com> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Bruce Allan <bruce.w.allan@intel.com> CC: Jesse Brandeburg <jesse.brandeburg@intel.com> CC: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-03 16:25:15 -05:00
John Fastabend	16e5cc6471	net: rework setup_tc ndo op to consume general tc operand This patch updates setup_tc so we can pass additional parameters into the ndo op in a generic way. To do this we provide structured union and type flag. This lets each classifier and qdisc provide its own set of attributes without having to add new ndo ops or grow the signature of the callback. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-17 09:47:35 -05:00
John Fastabend	e4c6734eaa	net: rework ndo tc op to consume additional qdisc handle parameter The ndo_setup_tc() op was added to support drivers offloading tx qdiscs however only support for mqprio was ever added. So we only ever added support for passing the number of traffic classes to the driver. This patch generalizes the ndo_setup_tc op so that a handle can be provided to indicate if the offload is for ingress or egress or potentially even child qdiscs. CC: Murali Karicheri <m-karicheri2@ti.com> CC: Shradha Shah <sshah@solarflare.com> CC: Or Gerlitz <ogerlitz@mellanox.com> CC: Ariel Elior <ariel.elior@qlogic.com> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Bruce Allan <bruce.w.allan@intel.com> CC: Jesse Brandeburg <jesse.brandeburg@intel.com> CC: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-17 09:47:35 -05:00
Yuval Mintz	445204644b	bnx2x: Remove unneccessary EXPORT_SYMBOL bnx2x_schedule_sp_rtnl is exported by bnx2x, although no other module uses it. Reported-by: Benjamin Poirier <bpoirier@suse.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-16 20:12:16 -05:00
David S. Miller	c07f30ad68	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2015-12-31 18:20:10 -05:00
Yuval Mintz	ea2465af3b	bnx2x: Prevent FW assertion when using Vxlan FW has a rare corner case in which a fragmented packet using lots of frags would not be linearized, causing the FW to assert while trying to transmit the packet. To prevent this, we need to make sure the window of fragements containing MSS worth of data contains 1 BD less than for regular packets due to the additional parsing BD. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-18 16:34:32 -05:00
Eric Dumazet	5abe255877	bnx2x: remove rx_pkt/rx_calls These fields are updated but never read. Remove the overhead. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-08 22:59:22 -05:00
Eric Dumazet	4d6acb62d2	bnx2x: avoid soft lockup in bnx2x_poll() Under heavy TX load, bnx2x_poll() can loop forever and trigger soft lockup bugs. A napi poll handler must yield after one TX completion round, risk of livelock is too high otherwise. Bug is very easy to trigger using a debug build, and udp flood, because of added cpu cycles in TX completion, and we do not receive enough packets to break the loop. Reported-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ariel Elior <ariel.elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-08 22:54:57 -05:00
Michal Schmidt	9adab1b036	bnx2x: change FW GRO error message to WARN_ONCE It's supposed to be impossible for TPA to give us anything else than IPv4 or IPv6 here. But in case there is a way to reach this error by some strange received frames, we don't want to flood the kernel log. WARN_ONCE is better for this purpose. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-05 19:00:02 -05:00
Michal Schmidt	5c9ffde4a0	bnx2x: drop redundant error message about allocation failure alloc_pages() already prints a warning when it fails. No need to emit another message. Certainly not at KERN_ERR level, because it is no big deal if this GFP_ATOMIC allocation fails occasionally. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-05 19:00:02 -05:00
Eric Dumazet	93d05d4a32	net: provide generic busy polling to all NAPI drivers NAPI drivers no longer need to observe a particular protocol to benefit from busy polling (CONFIG_NET_RX_BUSY_POLL=y) napi_hash_add() and napi_hash_del() are automatically called from core networking stack, respectively from netif_napi_add() and netif_napi_del() This patch depends on free_netdev() and netif_napi_del() being called from process context, which seems to be the norm. Drivers might still prefer to call napi_hash_del() on their own, since they might combine all the rcu grace periods into a single one, knowing their NAPI structures lifetime, while core networking stack has no idea of a possible combining. Once this patch proves to not bring serious regressions, we will cleanup drivers to either remove napi_hash_del() or provide appropriate rcu grace periods combining. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:42 -05:00
Eric Dumazet	93f93a4404	net: move skb_mark_napi_id() into core networking stack We would like to automatically provide busy polling support to all NAPI drivers, without them having to implement anything. skb_mark_napi_id() can be called from napi_gro_receive() and napi_get_frags(). Few drivers are still calling skb_mark_napi_id() because they use netif_receive_skb(). They should eventually call napi_gro_receive() instead. I will leave this to drivers maintainers. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:41 -05:00
Eric Dumazet	b59768c6b4	bnx2x: remove bnx2x_low_latency_recv() support Switch to native NAPI polling, as this reduces overhead and complexity. Normal path is faster, since one cmpxchg() is not anymore requested, and busy polling with the NAPI polling has same performance. Tested: lpk50:~# cat /proc/sys/net/core/busy_read 70 lpk50:~# nstat >/dev/null;./netperf -H lpk55 -t TCP_RR;nstat MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpk55.prod.google.com () port 0 AF_INET : first burst 0 Local /Remote Socket Size Request Resp. Elapsed Trans. Send Recv Size Size Time Rate bytes Bytes bytes bytes secs. per sec 16384 87380 1 1 10.00 40095.07 16384 87380 IpInReceives 401062 0.0 IpInDelivers 401062 0.0 IpOutRequests 401079 0.0 TcpActiveOpens 7 0.0 TcpPassiveOpens 3 0.0 TcpAttemptFails 3 0.0 TcpEstabResets 5 0.0 TcpInSegs 401036 0.0 TcpOutSegs 401052 0.0 TcpOutRsts 38 0.0 UdpInDatagrams 26 0.0 UdpOutDatagrams 27 0.0 Ip6OutNoRoutes 1 0.0 TcpExtDelayedACKs 1 0.0 TcpExtTCPPrequeued 98 0.0 TcpExtTCPDirectCopyFromPrequeue 98 0.0 TcpExtTCPHPHits 4 0.0 TcpExtTCPHPHitsToUser 98 0.0 TcpExtTCPPureAcks 5 0.0 TcpExtTCPHPAcks 101 0.0 TcpExtTCPAbortOnData 6 0.0 TcpExtBusyPollRxPackets 400832 0.0 TcpExtTCPOrigDataSent 400983 0.0 IpExtInOctets 21273867 0.0 IpExtOutOctets 21261254 0.0 IpExtInNoECTPkts 401064 0.0 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:40 -05:00
Mel Gorman	d0164adc89	mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim. This patch then converts a number of sites o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag. o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category where kswapd will still be woken but atomic reserves are not used as there is a one-entry mempool to guarantee progress. o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically now can trigger false positives. Some exceptions like dm-crypt.c exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations. o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM. The first key hazard to watch out for is callers that removed __GFP_WAIT and was depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH. The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases as other activity will wake kswapd. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-06 17:50:42 -08:00
Yuval Mintz	da3cc2da7c	bnx2: Fix bandwidth allocation for some MF modes Management firmware tells driver in case bandwidth configuration for a specific function exists, but [regretably] the same field has different meanings depending on the multi-function mode - it can either be a percentile value or an actual speed. For newer multi-function modes current logic is incorrect - driver understands values as actual speeds instead of percentages, causing the resulting chip configuration to be incorrect. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-17 10:27:57 -07:00
David S. Miller	182ad468e7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/cavium/Kconfig The cavium conflict was overlapping dependency changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-13 16:23:11 -07:00
Yuval Mintz	e1615903eb	bnx2x: Prevent null pointer dereference on SKB release On error flows its possible to free an SKB even if it was not allocated. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-10 14:31:58 -07:00
Yuval Mintz	05cc5a39dd	bnx2x: add vlan filtering offload Current driver always uses vlan-promisc mode, i.e., it receives both tagged and untagged traffic and lets the network stack drop packets tagged with unrequested vlan tags. This patch implements vlan-filtering offload in the driver - Unless explicitly configured to promisc mode, only untagged packets or packets tagged with requested vlans would reach the Rx flow. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-29 23:24:45 -07:00
Yuval Mintz	c48f350ff5	bnx2x: Add MFW dump support Devices with up-to-date management FW will be able to store register dumps on their persistent storage - in case management FW identifies a fatal error it would gather and store such dumps, which could later be retrieved using specific debug tools. This patch adds the necessary part in the driver in order to make the feature operational, as well as update users [under debug] during load in case their device contains a dump of a previous crash. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-22 10:47:27 -07:00
Yuval Mintz	230d00eb4b	bnx2x: new Multi-function mode - BD This adds support to a new multi-function mode, enabling driver to initialize such devices and correctly interacting with management FW for fully utilizing their features. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-22 10:47:26 -07:00
Yuval Mintz	4ad79e1301	bnx2x: Rebrand from 'broadcom' into 'qlogic' bnx2x still appears as a Broadcom driver even though the devices it utilizes belong to Qlogic for more than a year. This patch changes the various headers and the device strings to indicate the correct ownership of the device. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-22 10:47:26 -07:00

1 2 3 4 5 ...

262 Commits