linux

Commit Graph

Author	SHA1	Message	Date
Jacob Keller	0593186a17	fm10k: implement reset_notify handler for PCIe FLR events When a function level PCI reset is triggered using sysfs, it calls the driver's .reset_notify error handler. Implement a handler based on the now split fm10k_prepare_for_reset and fm10k_handle_reset functions, so that we fully reset the driver when the PCI function level reset occurs. This also ensures the reset is handled in a clean way by first disabling all the driver bits first and then restoring them after the function reset. Previously the stack simply performed a blind function reset and our driver didn't take any part in the process. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:14 -07:00
Jacob Keller	820c91aa9c	fm10k: use common reset flow when handling io errors from PCI stack Now that we have extracted the necessary steps for a split suspend/resume flow, re-use these functions instead of using the current open coded flow. This ensures that we don't miss any steps. It also ensures that we have the correct driver states set. Since we'll be handling all of the reset flow ourselves, we no longer need to request a reset in the io_slot_reset() function. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:13 -07:00
Jacob Keller	dc4b76c0fe	fm10k: implement prepare_suspend and handle_resume Implement fm10k_prepare_suspend and fm10k_handle_resume functions which abstract around the now existing fm10k_prepare_for_reset and fm10k_handle_reset. The new functions also handle stopping the service task, which is something that the original re-init flow does not need. Every other location that does a suspend/resume type flow is expected to use these functions, because otherwise they may have conflicts with the running watchdog routines. This also has the effect of preventing possible surprise remove events during handling of FLR events and PCIe errors. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:13 -07:00
Jacob Keller	40de1fad41	fm10k: split fm10k_reinit into two functions There are several flows in the driver which perform the similar function of tearing down software and restoring software to recover from certain errors or PCIe events, including: * fm10k_reinit * fm10k_suspend/resume * fm10k_io_error_detected/fm10k_io_resume In addition, we want to implement a .reset_notify() handler as well which will also perform similar function. Rework how the driver codes reset and resume flows by separating out the reinit logic into two functions "fm10k_prepare_for_reset" and "fm10k_handle_reset". This first step will allow us to re-use this functionality in the similar blocks of code instead of re-coding the same sequence of events slightly different. The end result should be more maintainable and correct, fixing several inconsistencies with the work flow. The new functions expect to take the rtnl_lock() themselves, and it does have the unfortunate side effect of having the reinit flow take then release then take the rtnl_lock. However, this minor downside is out weighted by the benefits of code reduction and reducing needless difference between these flows. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:13 -07:00
Jacob Keller	94877768cf	fm10k: wait for queues to drain if stop_hw() fails once It turns out that sometimes during a reset the Tx queues will be temporarily stuck longer than .stop_hw() expects. Work around this issue by attempting to .stop_hw() first. If it tails, wait a number of attempts until the Tx queues appear to be drained. After this, attempt stop_hw() again. This ensures that we avoid waiting if we don't need to, such as during the first initialization of a VF, and give the proper amount of time necessary to recover from most situations. It is possible that the hardware is actually stuck. For PFs, this is usually fixed by a datapath reset. Unfortunately the VF cannot request a similar reset for itself. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:12 -07:00
Jacob Keller	106ca42356	fm10k: only warn when stop_hw fails with FM10K_ERR_REQUESTS_PENDING When stop_hw() routine fails with FM10K_ERR_REQUESTS_PENDING, this indicates that the Tx or Rx queues did not shutdown within the time limit. Print a more suitable message at the dev_info level instead of dev_err. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:12 -07:00
Jacob Keller	9d73edee59	fm10k: prevent multiple threads updating statistics Also prevent updating stats while the interface is down. If we're already updating stats, just return doing nothing. When we take the device down, block stat updates until we come back up. This ensures that we avoid tearing down rings when we're updating statistics, and prevents updating statistics until we're up. We can't re-use the __FM10K_DOWN for this because it wouldn't prevent multiple threads from accessing statistics. Neither does it prevent the case where we start updating stats and then start going down in another thread. The fm10k_get_stats64 is except from this, because it has a completely different flow which does not suffer from the same issues as fm10k_update_stats might. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:10 -07:00
Jacob Keller	b624714bc9	fm10k: avoid possible null pointer dereference in fm10k_update_stats It's currently possible for fm10k_update_stats to be called during the window when we go down and the rings are removed. This can result in a null pointer dereference. In fm10k_get_stats64 we work around this by using ACCESS_ONCE and a null pointer check inside the loop. Use this same flow in the fm10k_update_stats to avoid the potential null pointer. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:09 -07:00
Jacob Keller	1b00c6c064	fm10k: no need to continue in fm10k_down if __FM10K_DOWN already set Return early from fm10k_down() when we are already down, since that means another thread is either already finished or has started going down, so shouldn't conflict with them. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:09 -07:00
Florian Westphal	860e9538a9	treewide: replace dev->trans_start update with helper Replace all trans_start updates with netif_trans_update helper. change was done via spatch: struct net_device *d; @@ - d->trans_start = jiffies + netif_trans_update(d) Compile tested only. Cc: user-mode-linux-devel@lists.sourceforge.net Cc: linux-xtensa@linux-xtensa.org Cc: linux1394-devel@lists.sourceforge.net Cc: linux-rdma@vger.kernel.org Cc: netdev@vger.kernel.org Cc: MPT-FusionLinux.pdl@broadcom.com Cc: linux-scsi@vger.kernel.org Cc: linux-can@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: linux-omap@vger.kernel.org Cc: linux-hams@vger.kernel.org Cc: linux-usb@vger.kernel.org Cc: linux-wireless@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: devel@driverdev.osuosl.org Cc: b.a.t.m.a.n@lists.open-mesh.org Cc: linux-bluetooth@vger.kernel.org Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Acked-by: Mugunthan V N <mugunthanvnm@ti.com> Acked-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-04 14:16:49 -04:00
Hannes Frederic Sowa	41419b9303	fm10k: protect fm10k_open in fm10k_io_resume with rtnl_lock fm10k_open requires rtnl_lock to be held. Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Shannon Nelson <shannon.nelson@intel.com> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: John Ronciak <john.ronciak@intel.com> Cc: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-21 15:35:43 -04:00
Jacob Keller	8664109467	fm10k: consistently use Intel(R) for driver names Update every header file and other locations to consistently use Intel(R) instead of just Intel. Also update copyright year of files which we modified. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-20 23:07:34 -07:00
Jacob Keller	3417415c3a	fm10k: do not disable PCI device in fm10k_io_error_detected fm10k_io_error_detected() does not need to call pci_disable_device(). In the cases where the reset needs to occur, the stack flow will result in calling fm10k_remove() which already disables the PCI device. If we leave the pci_disable_device(), we result in a warning about disabling an already disabled device. Many PCI drivers do call pci_disable_device() in their .error_detected() routines, but it does not appear to be required. In addition, these drivers have a check "is_pci_enabled()" call in their remove routines, which is how they chose to handle the duplicate device disable. This seems incorrect, since the PCI device structure is reference counted. It is very possible that the reference count for the PCI device could be greater than 1. In this case, you would remove the PCI device within the error_detected routine, reducing count to 1, then remove it again in the remove function, reducing it to zero. This would result in yet another disable somewhere else failing. Thus, we shouldn't be using is_pci_enabled() to check for this issue. Instead, just remove the extraneous pci_device_disable() found within the error_detected routine. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-20 23:06:23 -07:00
Jacob Keller	a7a7783ada	fm10k: correctly handle LPORT_MAP error Currently, any error responses from the switch manager after an LPORT_MAP request are silently ignored. At most the mailbox message will be reported as an error. This can result in unexpected behavior when the switch manager has configured a port with zero bandwidth. Add support for reading the fm10k_swapi_error structure from LPORT_MAP responses. If the message contains the necessary TLV and has a non-zero error code, report link down, clear the dglort_map, and delay the next get_host_state call by a reasonable delay. Also log an error message indicating that the LPORT_MAP request failed. The delay ensures preventing an interrupt storm on the switch manager, and reduces the number of mailbox messages we send in this scenario drastically. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-20 23:06:22 -07:00
Jacob Keller	9de6a1a6b8	fm10k: drop 1588 support The 1588 support within fm10k does not work correctly with the current version of the switch management software, and likely never worked correctly to begin with. Remove support for PTP/1588. Update copyright year for all these files while we're touching them. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-20 23:06:22 -07:00
Jacob Keller	1e4c32f3ed	fm10k: prevent RCU issues during AER events During an AER action response, we were calling fm10k_close without holding the rtnl_lock() which could lead to possible RCU warnings being produced due to 64bit stat updates among other causes. Similarly, we need rtnl_lock() around fm10k_open during fm10k_io_resume. Follow the same pattern elsewhere in the driver and protect the entire open/close sequence. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-20 23:06:21 -07:00
Jacob Keller	d8ec92f2cd	fm10k: fix a minor typo in some comments s/funciton/function to resolve a typo, and cleanup grammar on a few comments regarding processing the VF mailboxes. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:38 -07:00
Jacob Keller	c8ed563beb	fm10k: free MBX IRQ before clearing interrupt scheme During fm10k_io_error_detected we were clearing the interrupt scheme before we freed the MBX IRQ. This causes a kernel panic because the MBX IRQ are assigned after MSI-X initialization. Clearing the interrupt scheme results in removing the MSI-X entry table. Fix this by freeing the MBX IRQ before we clear the interrupt scheme, as we do elsewhere in the driver. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:22 -07:00
Jacob Keller	61e0217e83	fm10k: print error message when stop_hw fails fm10k_stop_hw_generic calls fm10k_disable_queues_generic, which may return an error code indicating that the queues were not stopped within the time limit. Notify the user by displaying a message in the kernel message ring, in a similar way to how we notify the user when reset_hw fails. There isn't much we can do to recover from this error, so currently nothing else is done. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:14 -07:00
Jacob Keller	e72319bba8	fm10k: don't initialize service task until later in probe Delay initialization of the service timer and service task until late probe. If we don't wait, failures in probe do not properly cleanup the service timer or service task items, which results in the kernel panic below, potentially freezing the whole system. In addition, ensure that the SERVICE_DISABLE bit is set before we request the MBX IRQ since the MBX interrupt attempts to schedule the service task otherwise. This prevents a similar trace from occurring after this change. We didn't notice this issue before because probe almost always completes successfully. I discovered it due to a mis-ordered mailbox handler array, which resulted in the following failure when requesting mailbox interrupt. [ 555.325619] ------------[ cut here ]------------ [ 555.325628] WARNING: CPU: 0 PID: 4941 at lib/list_debug.c:33 __list_add+0xa0/0xd0() [ 555.325631] list_add corruption. prev->next should be next (ffffffff81f46648), but was (null). (prev=ffff8807fad5d0e8). <snip> [ 555.325722] CPU: 0 PID: 4941 Comm: insmod Tainted: G OE 4.0.4-303.fc22.x86_64 #1 [ 555.325725] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.8x23.060520140825 06/05/2014 [ 555.325727] 0000000000000000 00000000b4f161b3 ffff88081a21f8e8 ffffffff81783124 [ 555.325734] 0000000000000000 ffff88081a21f940 ffff88081a21f928 ffffffff8109c66a [ 555.325740] 0000000064000000 ffff8807fad5d0e8 ffff8807fad5d0e8 ffffffff81f46648 [ 555.325746] Call Trace: [ 555.325752] [<ffffffff81783124>] dump_stack+0x45/0x57 [ 555.325757] [<ffffffff8109c66a>] warn_slowpath_common+0x8a/0xc0 [ 555.325759] [<ffffffff8109c6f5>] warn_slowpath_fmt+0x55/0x70 [ 555.325763] [<ffffffff813ba270>] __list_add+0xa0/0xd0 [ 555.325768] [<ffffffff81102d1d>] __internal_add_timer+0x9d/0x110 [ 555.325771] [<ffffffff81102dbf>] internal_add_timer+0x2f/0xc0 [ 555.325774] [<ffffffff81104e5a>] mod_timer+0x12a/0x230 [ 555.325782] [<ffffffffa03d54ca>] fm10k_probe+0x69a/0xc80 [fm10k] [ 555.325787] [<ffffffff813e8355>] local_pci_probe+0x45/0xa0 [ 555.325791] [<ffffffff8129cf42>] ? sysfs_do_create_link_sd.isra.2+0x72/0xc0 [ 555.325794] [<ffffffff813e96b9>] pci_device_probe+0xf9/0x150 [ 555.325799] [<ffffffff814d7e73>] driver_probe_device+0xa3/0x400 [ 555.325802] [<ffffffff814d82ab>] __driver_attach+0x9b/0xa0 [ 555.325805] [<ffffffff814d8210>] ? __device_attach+0x40/0x40 [ 555.325808] [<ffffffff814d5bd3>] bus_for_each_dev+0x73/0xc0 [ 555.325811] [<ffffffff814d78ce>] driver_attach+0x1e/0x20 [ 555.325815] [<ffffffff814d7480>] bus_add_driver+0x180/0x250 [ 555.325819] [<ffffffffa03b2000>] ? 0xffffffffa03b2000 [ 555.325823] [<ffffffff814d8aa4>] driver_register+0x64/0xf0 [ 555.325826] [<ffffffff813e7bec>] __pci_register_driver+0x4c/0x50 [ 555.325832] [<ffffffffa03d6ca3>] fm10k_register_pci_driver+0x23/0x30 [fm10k] [ 555.325838] [<ffffffffa03b2080>] fm10k_init_module+0x80/0x1000 [fm10k] [ 555.325843] [<ffffffff81002128>] do_one_initcall+0xb8/0x200 [ 555.325848] [<ffffffff811e10d2>] ? __vunmap+0xa2/0x100 [ 555.325852] [<ffffffff811fe239>] ? kmem_cache_alloc_trace+0x1b9/0x240 [ 555.325855] [<ffffffff8178230e>] ? do_init_module+0x28/0x1cb [ 555.325858] [<ffffffff81782346>] do_init_module+0x60/0x1cb [ 555.325862] [<ffffffff8112168e>] load_module+0x205e/0x26b0 [ 555.325866] [<ffffffff8111d110>] ? store_uevent+0x70/0x70 [ 555.325870] [<ffffffff812234b0>] ? kernel_read+0x50/0x80 [ 555.325873] [<ffffffff81121f3e>] SyS_finit_module+0xbe/0xf0 [ 555.325878] [<ffffffff81789749>] system_call_fastpath+0x12/0x17 [ 555.325880] ---[ end trace 9e0f58d071eafd2a ]--- Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:02 -07:00
Jacob Keller	de66c610a6	fm10k: prevent null pointer dereference of msix_entries table According to the C standard dereferencing a variable before it is checked invokes undefined behavior, and thus compilers are free to assume the check for NULL isn't necessary. Prevent this by re-ordering the NULL check of msix_entries in fm10k_free_mbx_irq. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:48:55 -07:00
Bruce Allan	11c49f79b2	fm10k: use ether_addr_copy to copy MAC address Cleanup the remaining instances of using memcpy() instead of the preferred ether_addr_copy(). Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:44:49 -07:00
Bruce Allan	838e610292	fm10k: demote BUG_ON() to WARN_ON() where appropriate We don't need to crash the kernel in this instance so just warn about the condition and play on. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:44:36 -07:00
Bruce Allan	fcdb0a9951	fm10k: cleanup remaining right-bit-shifted 1 Use BIT() macro instead. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:44:30 -07:00
Bruce Allan	1aab144c50	fm10k: Move constants to the right of binary operators The semantic patch that makes this change is available in scripts/coccinelle/misc/compare_const_fl.cocci. More information about semantic patching is available at http://coccinelle.lip6.fr/ Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:39:55 -07:00
Bruce Allan	f355bb5179	fm10k: use true/false for boolean get_host_state Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-22 05:16:45 -08:00
Jacob Keller	6186ddf06d	fm10k: use ether_addr_equal instead of memcmp When comparing MAC addresses, use ether_addr_equal instead of memcmp to ETH_ALEN length. Found and replaced using the following sed: sed -e 's/memcmp\x28\(.*\), ETH_ALEN\x29/!ether_addr_equal\x28\1\x29/' Reported-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-22 05:01:58 -08:00
Alexander Duyck	09f8a82b6a	fm10k: Cleanup exception handling for changing queues This patch is meant to cleanup the exception handling for the paths where we reset the interrupts and then reconfigure them. In all of these paths we had very different levels of exception handling. I have updated the driver so that all of the paths should result in a similar state if we fail. Specifically the driver will now unload the mailbox interrupt, free the queue vectors and MSI-X, and then detach the interface. In addition for any of the PCIe related resets I have added a check with the hw_ready function to just make sure the registers are in a readable state prior to reopening the interface. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-22 04:50:11 -08:00
Jacob Keller	504b0fdf92	fm10k: initialize xps at driver load Similar to ixgbe and i40e, initialize XPS on driver load so that we can take advantage of this kernel feature. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-13 15:28:43 -08:00
Bruce Allan	3d02b3df73	fm10k: cleanup overly long lines Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-13 15:28:38 -08:00
Bruce Allan	a4fcad656e	fm10k: whitespace cleanups Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-13 15:28:21 -08:00
Alexander Duyck	e00e23bceb	fm10k: Cleanup exception handling for mailbox interrupt This patch addresses two issues. First is the fact that the fm10k_mbx_free_irq was assuming msix_entries was valid and that will not always be the case. As such we need to add a check for if it is NULL. Second is the fact that we weren't freeing the IRQ if the mailbox API returned an error on trying to connect. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-13 15:28:10 -08:00
Jacob Keller	aa502b4a24	fm10k: consistently refer to VLANs and VLAN IDs Instead of using lowercase vlan, vid, or VID, always use VLAN or VLAN ID in comments when referring to VLANs. The original driver code was consistent, but recent patches have not been as consistent with this naming scheme. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-13 15:27:48 -08:00
Jacob Keller	40423dd2a5	fm10k: do not use CamelCase Avoid the use of CamelCase for some variable names that previously slipped through review. Reported-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-13 15:27:31 -08:00
Jacob Keller	c7bc952349	fm10k: TRIVIAL fix typo of hardware Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-05 23:55:24 -08:00
Jacob Keller	436ea956bf	fm10k: use macro for default Tx and Rx ITR values Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-05 23:55:22 -08:00
Jacob Keller	242722dd3d	fm10k: Update adaptive ITR algorithm The existing adaptive ITR algorithm is overly restrictive. It throttles incorrectly for various traffic rates, and does not produce good performance. The algorithm now allows for more interrupts per second, and does some calculation to help improve for smaller packet loads. In addition, take into account the new itr_scale from the hardware which indicates how much to scale due to PCIe link speed. Reported-by: Matthew Vick <matthew.vick@intel.com> Reported-by: Alex Duyck <alexander.duyck@gmail.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-05 23:55:21 -08:00
Jacob Keller	875328e4bc	fm10k: reinitialize queuing scheme after calling init_hw The init_hw function may fail, and in the case of VFs, it might change the number of maximum queues available. Thus, for every flow which checks init_hw, we need to ensure that we clear the queue scheme before, and initialize it after. The fm10k_io_slot_reset path will end up triggering a reset so fm10k_reinit needs this change. The fm10k_io_error_detected and fm10k_io_resume also need to properly clear and reinitialize the queue scheme. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-05 23:55:16 -08:00
Jacob Keller	1343c65f70	fm10k: always check init_hw for errors A recent change modified init_hw in some flows the function may fail on VF devices. For example, if a VF doesn't yet own its own queues. However, many callers of init_hw didn't bother to check the error code. Other callers checked but only displayed diagnostic messages without actually handling the consequences. Fix this by (a) always returning and preventing the netdevice from going up, and (b) printing the diagnostic in every flow for consistency. This should resolve an issue where VF drivers would attempt to come up before the PF has finished assigning queues. In addition, change the dmesg output to explicitly show the actual function that failed, instead of combining reset_hw and init_hw into a single check, to help for future debugging. Fixes: 1d568b0f6424 ("fm10k: do not assume VF always has 1 queue") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-05 23:55:15 -08:00
Jacob Keller	e0244903d4	fm10k: set netdev features in one location Don't change netdev hw_features later in fm10k_probe, instead set all values inside fm10k_alloc_netdev. To do so, we need to know the MAC type (whether it is PF or VF) in order to determine what to do. This helps ensure that all logic regarding features is co-located. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-05 23:55:13 -08:00
Alexander Duyck	de125aaecf	fm10k: use napi_schedule_irqoff() The fm10k_msix_clean_rings function runs from hard interrupt context or with interrupts already disabled in netpoll. It can use napi_schedule_irqoff() instead of napi_schedule() Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-11-25 10:05:54 -08:00
Jacob Keller	80043f3bf5	fm10k: add support for extra debug statistics Add a private ethtool flag to enable display of these statistics, which are generally less useful. However, sometimes it can be useful for debugging purposes. The most useful portion is the ability to see what the PF thinks the VF mailboxes look like. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-09-22 15:58:27 -07:00
Jacob Keller	8427672abd	fm10k: remove comment about rtnl_lock around mbx operations This comment is no longer true due to a couple of mailbox locking refactors, and we now don't actually do any rtnl protected operations directly in the mailbox path. Remove this comment as it is factually incorrect and confusing. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-09-22 15:58:26 -07:00
Jacob Keller	95f4f8da64	fm10k: re-enable VF after a full reset on detection of a Malicious event Modify behavior of Malicious Driver Detection events. Presently, the hardware disables the VF queues and re-assigns them to the PF. This causes the VF in question to continuously Tx hang, because it assumes that it can transmit over the queues in question. For transient events, this results in continuous logging of malicious events. New behavior is to reset the LPORT and VF state, so that the VF will have to reset and re-enable itself. This does mean that malicious VFs will possibly be able to continue and attempt malicious events again. However, it is expected that system administrators will step in and manually remove or disable the VF in question. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-09-15 17:06:14 -07:00
Jacob Keller	e71c931842	fm10k: send traffic on default VID to VLAN device if we have one This patch ensures that VLAN traffic on the default VID will go to the corresponding VLAN device if it exists. To do this, mask the rx_ring VID if we have an active VLAN on that VID. For this to work correctly, we need to update fm10k_process_skb_fields to correctly mask off the VLAN_PRIO_MASK bits and compare them separately, otherwise we incorrectly compare the priority bits with the cleared flag. This also happens to fix a related bug where having priority bits set causes us to incorrectly classify traffic. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-09-15 17:06:04 -07:00
Alexander Duyck	0ff36676a3	fm10k: Report MAC address on driver load This change adds the MAC address to the list of values recorded on driver load. The MAC address represents the serial number of the unit and allows us to track the value should a card be replaced in a system. The log message should now be similar in output to that of ixgbe. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-09-15 17:05:41 -07:00
Jacob Keller	bdc7f5902d	fm10k: update netdev perm_addr during reinit, instead of at up Update the netdev permanent address during fm10k_reinit enables the user to immediately see the new MAC address on the VF even if the device isn't up. The previous code required that the device by opened before changes would appear. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-09-15 17:05:32 -07:00
Jacob Keller	106c07a495	fm10k: update fm10k_slot_warn to use pcie_get_minimum link This is useful in cases where we connect to a slot at Gen3, but the slot is behind a bus which only connected at Gen2. This generally only happens when a PCIe switch is in the sequence of devices, and can be very confusing when you see slow performance with no obvious cause. I am aware this patch has a few lines that break 80 characters, but there does not seem to be a readable way to format them to less than 80 characters. Suggestions welcome. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-09-15 17:05:28 -07:00
Jacob Keller	e40296628b	fm10k: disable service task during suspend The service task reads some registers as part of its normal routine, even while the interface is down. Normally this is ok. However, during suspend we have disabled the PCI device. Due to this, registers will read in the same way as a surprise-remove event. Disable the service task while we suspend, and re-enable it after we resume. If we don't do this, the device could be UP when you suspend and come back from resume as closed (since fm10k closes the device when it gets a surprise remove). Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-09-15 17:05:18 -07:00
Jacob Keller	c04ae58e2b	fm10k: use dma_set_mask_and_coherent in fm10k_probe This patch cleans up the use of dma_get_required_mask and uses the simpler dma_set_mask_and_coherent function instead of doing these as separate steps. I removed the dma_get_required_mask call because based on some minimal testing it appears that either (a) we're not doing the right thing with the call or (b) we don't need it anyways. If the value returned is <48bits, we'll end up trying with 48 bits anyways. If it's over 48bits, fm10k can't support that anyways, and we should try 48bits. If 48bits fails, we'll fallback to 32bits. This cleans up some very funky code. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-06-17 14:21:10 -07:00

1 2

79 Commits