linux

Commit Graph

Author	SHA1	Message	Date
Mika Westerberg	52be9464aa	PCI/portdrv: Resume upon exit from system suspend if left runtime suspended Currently we try to keep PCIe ports runtime suspended over system suspend if possible. This mostly happens when entering suspend-to-idle because there is no need to re-configure wake settings. This causes problems if the parent port goes into D3cold and it gets resumed upon exit from system suspend. This may happen for example if the port is part of PCIe switch and the same switch is connected to a PCIe endpoint that needs to be resumed. The way exit from D3cold works according PCIe 4.0 spec 5.3.1.4.2 is that power is restored and cold reset is signaled. After this the device is in D0unitialized state keeping PME context if it supports wake from D3cold. The problem occurs when a PCIe hotplug port is left suspended and the parent port goes into D3cold and back to D0: the port keeps its PME context but since everything else is reset back to defaults (D0unitialized) it is not set to detect hotplug events anymore. For this reason change the PCIe portdrv power management logic so that it is fine to keep the port runtime suspended over system suspend but it needs to be resumed upon exit to make sure it gets properly re-initialized. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-10-02 16:04:40 -05:00
Keith Busch	f0157160b3	PCI: Make link active reporting detection generic The spec has timing requirements when waiting for a link to become active after a conventional reset. Implement those hard delays when waiting for an active link so pciehp and dpc drivers don't need to duplicate this. For devices that don't support data link layer active reporting, wait the fixed time recommended by the PCIe spec. Signed-off-by: Keith Busch <keith.busch@intel.com> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>	2018-10-02 16:04:40 -05:00
Keith Busch	a6bd101b8f	PCI: Unify device inaccessible Bring surprise removals and permanent failures together so we no longer need separate flags. The implementation enforces that error handling will not be able to override a surprise removal's permanent channel failure. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>	2018-10-02 16:04:40 -05:00
Keith Busch	7b42d97e99	PCI/ERR: Always report current recovery status for udev A device still participates in error recovery even if it doesn't have the error callbacks. Always provide the status for user event watchers. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>	2018-10-02 16:04:40 -05:00
Keith Busch	542aeb9c8f	PCI/ERR: Simplify broadcast callouts There is no point in having a generic broadcast function if it needs to have special cases for each callback it broadcasts. Abstract the error broadcast to only the necessary information and removes the now unnecessary helper to walk the bus. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>	2018-10-02 16:04:40 -05:00
Keith Busch	bfcb79fca1	PCI/ERR: Run error recovery callbacks for all affected devices If an Endpoint reported an error with ERR_FATAL, we previously ran driver error recovery callbacks only for the Endpoint's driver. But if we reset a Link to recover from the error, all downstream components are affected, including the Endpoint, any multi-function peers, and children of those peers. Initiate the Link reset from the deepest Downstream Port that is reliable, and call the error recovery callbacks for all its children. If a Downstream Port (including a Root Port) reports an error, we assume the Port itself is reliable and we need to reset its downstream Link. In all other cases (Switch Upstream Ports, Endpoints, Bridges, etc), we assume the Link leading to the component needs to be reset, so we initiate the reset at the parent Downstream Port. This allows two other clean-ups. First, we currently only use a Link reset, which can only be initiated using a Downstream Port, so we can remove checks for Endpoints. Second, the Downstream Port where we initiate the Link reset is reliable (unlike components downstream from it), so the special cases for error detect and resume are no longer necessary. Signed-off-by: Keith Busch <keith.busch@intel.com> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>	2018-09-26 14:23:15 -05:00
Keith Busch	bdb5ac8577	PCI/ERR: Handle fatal error recovery We don't need to be paranoid about the topology changing while handling an error. If the device has changed in a hotplug capable slot, we can rely on the presence detection handling to react to a changing topology. Restore the fatal error handling behavior that existed before merging DPC with AER with `7e9084b367` ("PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices"). Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>	2018-09-26 14:23:14 -05:00
Keith Busch	c4eed62a21	PCI/ERR: Use slot reset if available The secondary bus reset may have link side effects that a hotplug capable port may incorrectly react to. Use the slot specific reset for hotplug ports, fixing the undesirable link down-up handling during error recovering. Signed-off-by: Keith Busch <keith.busch@intel.com> [bhelgaas: fold in https://lore.kernel.org/linux-pci/20180926152326.14821-1-keith.busch@intel.com for issue reported by Stephen Rothwell <sfr@canb.auug.org.au>] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>	2018-09-21 12:18:10 -05:00
Keith Busch	9d938ea53b	PCI/AER: Don't read upstream ports below fatal errors The AER driver has never read the config space of an endpoint that reported a fatal error because the link to that device is considered unreliable. An ERR_FATAL from an upstream port almost certainly indicates an error on its upstream link, so we can't expect to reliably read its config space for the same reason. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>	2018-09-21 12:18:09 -05:00
Keith Busch	60271ab044	PCI/AER: Take reference on error devices Error handling may be running in parallel with a hot removal. Reference count the device during AER handling so the device can not be freed while AER wants to reference it. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>	2018-09-21 12:18:08 -05:00
Keith Busch	4f802170a8	PCI/DPC: Save and restore config state This patch provides DPC save and restore capabilities. This is necessary for the driver to observe DPC events in the event the configuration space needs to be restored after a reset. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>	2018-09-20 16:06:27 -05:00
Keith Busch	874b325111	PCI: portdrv: Restore PCI config state on slot reset The port's config space may be cleared after a link reset, which wipes out the bridge's bus and memory windows. Restore the config space that was saved during probe so we can access downstream devices. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>	2018-09-20 16:06:18 -05:00
Keith Busch	c29de84149	PCI: portdrv: Initialize service drivers directly The PCI port driver saves the PCI state after initializing the device with the applicable service devices. This was, however, before the service drivers were even registered because PCI probe happens before the device_initcall initialized those service drivers. The config space state that the services set up were not being saved. The end result would cause PCI devices to not react to events that the drivers think they did if the PCI state ever needed to be restored. Fix this by changing the service drivers from using the init calls to having the portdrv driver calling the services directly. This will get the state saved as desired, while making the relationship between the port driver and the services under it more explicit in the code. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>	2018-09-20 12:05:54 -05:00
Patrick Talbert	17c9148736	PCI/ASPM: Do not initialize link state when aspm_disabled is set Now that ASPM is configured for all PCIe devices at boot, a problem is seen with systems that set the FADT NO_ASPM bit. This bit indicates that the OS should not alter the ASPM state, but when pcie_aspm_init_link_state() runs it only checks for !aspm_support_enabled. This misses the ACPI_FADT_NO_ASPM case because that is setting aspm_disabled. The result is systems may hang at boot after 1302fcf; avoidable if they boot with pcie_aspm=off (sets !aspm_support_enabled). Fix this by having aspm_init_link_state() check for either !aspm_support_enabled or acpm_disabled. Link: https://bugzilla.kernel.org/show_bug.cgi?id=201001 Fixes: `1302fcf0d0` ("PCI: Configure all devices, not just hot-added ones") Signed-off-by: Patrick Talbert <ptalbert@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-09-18 16:19:06 -05:00
Lukas Wunner	a50ac6bfd6	PCI: Simplify disconnected marking Commit `89ee9f7680` ("PCI: Add device disconnected state") iterates over the devices on a parent bus, marks each as disconnected, then marks each device's children as disconnected using pci_walk_bus(). The same can be achieved more succinctly by calling pci_walk_bus() on the parent bus. Moreover, this does not need to wait until acquiring pci_lock_rescan_remove(), so move it out of that critical section. The critical section in err.c contains a pci_dev_get() / pci_dev_put() pair which was apparently copy-pasted from pciehp_pci.c. In the latter it serves the purpose of holding the struct pci_dev in place until the Command register is updated. err.c doesn't do anything like that, hence the pair is unnecessary. Remove it. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Oza Pawandeep <poza@codeaurora.org> Cc: Sinan Kaya <okaya@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2018-09-17 16:34:35 -05:00
Lukas Wunner	aeae4f3e5c	PCI/ASPM: Fix link_state teardown on device removal Upon removal of the last device on a bus, the link_state of the bridge leading to that bus is sought to be torn down by having pci_stop_dev() call pcie_aspm_exit_link_state(). When ASPM was originally introduced by commit `7d715a6c1a` ("PCI: add PCI Express ASPM support"), it determined whether the device being removed is the last one by calling list_empty() on the bridge's subordinate devices list. That didn't work because the device is only removed from the list slightly later in pci_destroy_dev(). Commit `3419c75e15` ("PCI: properly clean up ASPM link state on device remove") attempted to fix it by calling list_is_last(), but that's not correct either because it checks whether the device is at the end of the list, not whether it's the last one left in the list. If the user removes the device which happens to be at the end of the list via sysfs but other devices are preceding the device in the list, the link_state is torn down prematurely. The real fix is to move the invocation of pcie_aspm_exit_link_state() to pci_destroy_dev() and reinstate the call to list_empty(). Remove a duplicate check for dev->bus->self because pcie_aspm_exit_link_state() already contains an identical check. Fixes: `7d715a6c1a` ("PCI: add PCI Express ASPM support") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Shaohua Li <shaohua.li@intel.com> Cc: stable@vger.kernel.org # v2.6.26	2018-09-17 16:32:23 -05:00
Bjorn Helgaas	3a48dc6fc2	Merge branch 'pci/virtualization' - To avoid bus errors, enable PASID only if entire path supports End-End TLP prefixes (Sinan Kaya) - Unify slot and bus reset functions and remove hotplug knowledge from callers (Sinan Kaya) - Add Function-Level Reset quirks for Intel and Samsung NVMe devices to fix guest reboot issues (Alex Williamson) - Add function 1 DMA alias quirk for Marvell 88SS9183 PCIe SSD Controller (Bjorn Helgaas) * pci/virtualization: PCI: Add function 1 DMA alias quirk for Marvell 88SS9183 PCI: Delay after FLR of Intel DC P3700 NVMe PCI: Disable Samsung SM961/PM961 NVMe before FLR PCI: Export pcie_has_flr() PCI: Rename pci_try_reset_bus() to pci_reset_bus() PCI: Deprecate pci_reset_bus() and pci_reset_slot() functions PCI: Unify try slot and bus reset API PCI: Hide pci_reset_bridge_secondary_bus() from drivers IB/hfi1: Use pci_try_reset_bus() for initiating PCI Secondary Bus Reset PCI: Handle error return from pci_reset_bridge_secondary_bus() PCI/IOV: Tidy pci_sriov_set_totalvfs() PCI: Enable PASID only if entire path supports End-End TLP prefixes # Conflicts: # drivers/pci/hotplug/pciehp_hpc.c	2018-08-15 14:59:06 -05:00
Bjorn Helgaas	c0638a4553	Merge branch 'pci/hotplug' - Simplify SHPC existence/permission checks (Bjorn Helgaas) - Remove hotplug sample skeleton driver (Lukas Wunner) - Convert pciehp to threaded IRQ handling (Lukas Wunner) - Improve pciehp tolerance of missed events and initially unstable links (Lukas Wunner) - Clear spurious pciehp events on resume (Lukas Wunner) - Add pciehp runtime PM support, including for Thunderbolt controllers (Lukas Wunner) - Support interrupts from pciehp bridges in D3hot (Lukas Wunner) * pci/hotplug: PCI: pciehp: Deduplicate presence check on probe & resume PCI: pciehp: Avoid implicit fallthroughs in switch statements PCI: Whitelist Thunderbolt ports for runtime D3 PCI: Whitelist native hotplug ports for runtime D3 PCI: sysfs: Resume to D0 on function reset PCI: pciehp: Resume parent to D0 on config space access PCI: pciehp: Resume to D0 on enable/disable PCI: pciehp: Support interrupts sent from D3hot PCI: pciehp: Obey compulsory command delay after resume PCI: pciehp: Clear spurious events earlier on resume PCI: portdrv: Deduplicate PM callback iterator PCI: pciehp: Avoid slot access during reset PCI: pciehp: Always enable occupied slot on probe PCI: pciehp: Become resilient to missed events PCI: pciehp: Tolerate initially unstable link PCI: pciehp: Declare pciehp_enable/disable_slot() static PCI: pciehp: Drop enable/disable lock PCI: pciehp: Enable/disable exclusively from IRQ thread PCI: pciehp: Track enable/disable status PCI: pciehp: Publish to user space last on probe PCI: hotplug: Demidlayer registration with the core PCI: pciehp: Drop slot workqueue PCI: pciehp: Handle events synchronously PCI: pciehp: Stop blinking on slot enable failure PCI: pciehp: Convert to threaded polling PCI: pciehp: Convert to threaded IRQ PCI: pciehp: Document struct slot and struct controller PCI: pciehp: Declare pciehp_unconfigure_device() void PCI: pciehp: Drop unnecessary NULL pointer check PCI: pciehp: Fix unprotected list iteration in IRQ handler PCI: pciehp: Fix use-after-free on unplug PCI: hotplug: Don't leak pci_slot on registration failure PCI: hotplug: Delete skeleton driver PCI: shpchp: Separate existence of SHPC and permission to use it	2018-08-15 14:58:52 -05:00
Bjorn Helgaas	1ca358a8e3	Merge branch 'pci/dpc' - Defer DPC event handling to work queue (Keith Busch) - Use threaded IRQ for DPC bottom half (Keith Busch) - Print AER status while handling DPC events (Keith Busch) * pci/dpc: PCI/DPC: Remove indirection waiting for inactive link PCI/DPC: Use threaded IRQ for bottom half handling PCI/DPC: Print AER status in DPC event handling PCI/DPC: Remove rp_pio_status from dpc struct PCI/DPC: Defer event handling to work queue PCI/DPC: Leave interrupts enabled while handling event	2018-08-15 14:58:51 -05:00
Bjorn Helgaas	187dacce19	Merge branch 'pci/aspm' - Use sysfs_match_string() to simplify ASPM sysfs parsing (Andy Shevchenko) - Remove unnecessary includes of <linux/pci-aspm.h> (Bjorn Helgaas) * pci/aspm: PCI: Remove unnecessary include of <linux/pci-aspm.h> iwlwifi: Remove unnecessary include of <linux/pci-aspm.h> ath9k: Remove unnecessary include of <linux/pci-aspm.h> igb: Remove unnecessary include of <linux/pci-aspm.h> PCI/ASPM: Convert to use sysfs_match_string() helper	2018-08-15 14:58:46 -05:00
Bjorn Helgaas	3c3ab37f4c	Merge branch 'pci/aer' - Decode AER errors with names similar to "lspci" (Tyler Baicar) - Expose AER statistics in sysfs (Rajat Jain) - Clear AER status bits selectively based on the type of recovery (Oza Pawandeep) - Honor "pcie_ports=native" even if HEST sets FIRMWARE_FIRST (Alexandru Gagniuc) - Don't clear AER status bits if we're using the "Firmware-First" strategy where firmware owns the registers (Alexandru Gagniuc) * pci/aer: PCI/AER: Don't clear AER bits if error handling is Firmware-First PCI/AER: Remove duplicate PCI_EXP_AER_FLAGS definition PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset PCI/AER: Clear device status bits during ERR_COR handling PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path PCI/AER: Factor out ERR_NONFATAL status bit clearing PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery PCI/AER: Clear only ERR_FATAL status bits during fatal recovery PCI/AER: Honor "pcie_ports=native" even if HEST sets FIRMWARE_FIRST PCI/AER: Add sysfs attributes for rootport cumulative stats PCI/AER: Add sysfs attributes to provide AER stats and breakdown PCI/AER: Define aer_stats structure for AER capable devices PCI/AER: Move internal declarations to drivers/pci/pci.h PCI/AER: Adopt lspci names for AER error decoding PCI/AER: Expose internal API for obtaining AER information # Conflicts: # drivers/pci/pci.h	2018-08-15 14:58:45 -05:00
Alexandru Gagniuc	45687f96c1	PCI/AER: Don't clear AER bits if error handling is Firmware-First If the platform requests Firmware-First error handling, firmware is responsible for reading and clearing AER status bits. If OSPM also clears them, we may miss errors. See ACPI v6.2, sec 18.3.2.5 and 18.4. This race is mostly of theoretical significance, as it is not easy to reasonably demonstrate it in testing. Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com> [bhelgaas: add similar guards to pci_cleanup_aer_uncorrect_error_status() and pci_aer_clear_fatal_status()] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-08-15 14:35:40 -05:00
Andy Shevchenko	36131ce9a0	PCI/ASPM: Convert to use sysfs_match_string() helper The sysfs_match_string() helper returns index of the matching string in an array. Use it in pcie_aspm_set_policy() to simplify the code. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> [bhelgaas: squash sysfs_match_string() fix into original patch for issue Reported-by: Heiner Kallweit <hkallweit1@gmail.com>] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-08-06 14:30:34 -05:00
Bjorn Helgaas	944d58595b	PCI/AER: Remove duplicate PCI_EXP_AER_FLAGS definition PCI_EXP_AER_FLAGS was defined twice (with identical definitions), once under #ifdef CONFIG_ACPI_APEI, and again at the top level. This looks like my merge error from these commits: `fd3362cb73` ("PCI/AER: Squash aerdrv_core.c into aerdrv.c") `41cbc9eb1a` ("PCI/AER: Squash ecrc.c into aerdrv.c") Remove the duplicate PCI_EXP_AER_FLAGS definition. Fixes: `41cbc9eb1a` ("PCI/AER: Squash ecrc.c into aerdrv.c") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-31 16:26:09 -05:00
Lukas Wunner	7903782460	PCI: pciehp: Clear spurious events earlier on resume Thunderbolt hotplug ports that were occupied before system sleep resume with their downstream link in "off" state. Only after the Thunderbolt controller has reestablished the PCIe tunnels does the link go up. As a result, a spurious Presence Detect Changed and/or Data Link Layer State Changed event occurs. The events are not immediately acted upon because tunnel reestablishment happens in the ->resume_noirq phase, when interrupts are still disabled. Also, notification of events may initially be disabled in the Slot Control register when coming out of system sleep and is reenabled in the ->resume_noirq phase through: pci_pm_resume_noirq() pci_pm_default_resume_early() pci_restore_state() pci_restore_pcie_state() It is not guaranteed that the events are acted upon at all: PCIe r4.0, sec 6.7.3.4 says that "a port may optionally send an MSI when there are hot-plug events that occur while interrupt generation is disabled, and interrupt generation is subsequently enabled." Note the "optionally". If an MSI is sent, pciehp will gratuitously turn the slot off and back on once the ->resume_early phase has commenced. If an MSI is not sent, the extant, unacknowledged events in the Slot Status register will prevent future notification of presence or link changes. Commit `13c65840fe` ("PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume") fixed the latter by clearing the events in the ->resume phase. Move this to the ->resume_noirq phase to also fix the gratuitous disable/enablement of the slot. The commit further restored the Slot Control register in the ->resume phase, but that's dispensable because as shown above it's already been done in the ->resume_noirq phase. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>	2018-07-31 11:07:59 -05:00
Lukas Wunner	6ccb127ba6	PCI: portdrv: Deduplicate PM callback iterator Replace suspend_iter() and resume_iter() with a single function pm_iter() to allow addition of port service callbacks for further power management phases without having to add another iterator each time. No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-31 11:07:59 -05:00
Thomas Tai	bd91b56cb3	PCI/AER: Work around use-after-free in pcie_do_fatal_recovery() When an fatal error is received by a non-bridge device, the device is removed, and pci_stop_and_remove_bus_device() deallocates the device structure. The freed device structure is used by subsequent code to send uevents and print messages. Hold a reference on the device until we're finished using it. This is not an ideal fix because pcie_do_fatal_recovery() should not use the device at all after removing it, but that's too big a project for right now. Fixes: `7e9084b367` ("PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices") Signed-off-by: Thomas Tai <thomas.tai@oracle.com> [bhelgaas: changelog, reduce get/put coverage] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-26 12:13:04 -05:00
Oza Pawandeep	89e1f5cb1e	PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset The pci_error_handlers.slot_reset() callback is only used for non-bridge devices (see broadcast_error_message()). Since portdrv only binds to bridges, we don't need pcie_portdrv_slot_reset(), so remove it. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: changelog, remove pcie_portdrv_slot_reset() completely] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-20 15:27:13 -05:00
Oza Pawandeep	10d790d99d	PCI/AER: Clear device status bits during ERR_COR handling In case of correctable error, the Correctable Error Detected bit in the Device Status register is set. Clear it after handling the error. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-20 15:27:12 -05:00
Oza Pawandeep	ec752f5d54	PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL Clear the device status bits while handling both ERR_FATAL and ERR_NONFATAL cases. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: rename to pci_aer_clear_device_status(), declare internal to PCI core instead of exposing it everywhere] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-20 15:27:11 -05:00
Oza Pawandeep	43ec03a9e5	PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path broadcast_error_message() is only used for ERR_NONFATAL events, when the state is always pci_channel_io_normal, so remove the unused alternate path. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-20 15:27:10 -05:00
Oza Pawandeep	5b6c09660d	PCI/AER: Factor out ERR_NONFATAL status bit clearing aer_error_resume() clears all ERR_NONFATAL error status bits. This is exactly what pci_cleanup_aer_uncorrect_error_status(), so use that instead of duplicating the code. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: split to separate patch] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-20 15:27:09 -05:00
Oza Pawandeep	e7b0b847de	PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery pci_cleanup_aer_uncorrect_error_status() is called by driver .slot_reset() methods when handling ERR_NONFATAL errors. Previously this cleared all the bits, including ERR_FATAL bits. Since we're only handling ERR_NONFATAL errors, clear only the ERR_NONFATAL error status bits. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: split to separate patch] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-20 15:27:08 -05:00
Bjorn Helgaas	7ab92e89bf	PCI/AER: Clear only ERR_FATAL status bits during fatal recovery During recovery from fatal errors, we previously called pci_cleanup_aer_uncorrect_error_status(), which cleared all uncorrectable error status bits (both ERR_FATAL and ERR_NONFATAL). Instead, call a new pci_aer_clear_fatal_status() that clears only the ERR_FATAL bits (as indicated by the PCI_ERR_UNCOR_SEVER register). Based-on-patch-by: Oza Pawandeep <poza@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-20 15:27:07 -05:00
Sinan Kaya	381634cad1	PCI: Hide pci_reset_bridge_secondary_bus() from drivers Rename pci_reset_bridge_secondary_bus() to pci_bridge_secondary_bus_reset() and move the declaration from linux/pci.h to drivers/pci.h to be used internally in PCI directory only. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 18:04:23 -05:00
Sinan Kaya	1842623850	PCI: Handle error return from pci_reset_bridge_secondary_bus() Commit `01fd61c0b9` ("PCI: Add a return type for pci_reset_bridge_secondary_bus()") added a return value to the function to return if a device is accessible following a reset. Callers are not checking the value. Pass error code up high in the stack if device is not accessible. Fixes: `01fd61c0b9` ("PCI: Add a return type for pci_reset_bridge_secondary_bus()") Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 18:04:23 -05:00
Keith Busch	e77b8216a2	PCI/DPC: Remove indirection waiting for inactive link Simplify waiting for the contained link to become inactive, removing the indirection to a unnecessary DPC-specific handler. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:21:01 -05:00
Keith Busch	738c4e411d	PCI/DPC: Use threaded IRQ for bottom half handling Remove the work struct that was being used to handle a DPC event and use a threaded IRQ instead. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:21:01 -05:00
Keith Busch	8aefa9b0d9	PCI/DPC: Print AER status in DPC event handling A DPC enabled device suppresses ERR_(NON)FATAL messages, preventing the AER handler from reporting error details. If the DPC trigger reason says the downstream port detected the error, collect the AER uncorrectable status for logging, then clear the status. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:21:01 -05:00
Keith Busch	f1d16b1756	PCI/DPC: Remove rp_pio_status from dpc struct We don't need to save the rp pio status across multiple contexts as all DPC event handling occurs in a single work queue context. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:21:01 -05:00
Keith Busch	0c27e28f77	PCI/DPC: Defer event handling to work queue Move all event handling to the existing work queue, which will make it simpler to pass event information to the handler. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:21:01 -05:00
Keith Busch	f8d46c89c8	PCI/DPC: Leave interrupts enabled while handling event Now that the DPC driver clears the interrupt status before exiting the IRQ handler, we don't need to abuse the DPC control register to know if a shared interrupt is for a new DPC event: a DPC port can not trigger a second interrupt until the host clears the trigger status later in the work queue handler. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:20:59 -05:00
Alexandru Gagniuc	7af02fcd84	PCI/AER: Honor "pcie_ports=native" even if HEST sets FIRMWARE_FIRST According to the documentation, "pcie_ports=native", linux should use native AER and DPC services. While that is true for the _OSC method parsing, this is not the only place that is checked. Should the HEST list PCIe ports as firmware-first, linux will not use native services. This happens because aer_acpi_firmware_first() doesn't take 'pcie_ports' into account. This is wrong. DPC uses the same logic when it decides whether to load or not, so fixing this also fixes DPC not loading. Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com> [bhelgaas: return "false" from bool function (from kbuild robot)] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 16:19:53 -05:00
Rajat Jain	12833017e5	PCI/AER: Add sysfs attributes for rootport cumulative stats Add sysfs attributes for rootport statistics (that are cumulative of all the ERR_* messages seen on this PCI hierarchy). Signed-off-by: Rajat Jain <rajatja@google.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 16:19:52 -05:00
Rajat Jain	81aa5206f9	PCI/AER: Add sysfs attributes to provide AER stats and breakdown Add sysfs attributes to provide total and breakdown of the AERs seen, into different type of correctable, fatal and nonfatal errors: /sys/bus/pci/devices/<dev>/aer_dev_correctable /sys/bus/pci/devices/<dev>/aer_dev_fatal /sys/bus/pci/devices/<dev>/aer_dev_nonfatal Signed-off-by: Rajat Jain <rajatja@google.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 16:19:51 -05:00
Rajat Jain	db89ccbe52	PCI/AER: Define aer_stats structure for AER capable devices Define a structure to hold the AER statistics. There are 2 groups of statistics: dev_* counters that are to be collected for all AER capable devices and rootport_* counters that are collected for all (AER capable) rootports only. Allocate and free this structure when device is added or released (thus counters survive the lifetime of the device). Signed-off-by: Rajat Jain <rajatja@google.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 16:17:03 -05:00
Rajat Jain	60ed982a4e	PCI/AER: Move internal declarations to drivers/pci/pci.h Since pci_aer_init() and pci_no_aer() are used only internally, move their declarations to the PCI internal header file. Also, no one cares about return value of pci_aer_init(), so make it void. Signed-off-by: Rajat Jain <rajatja@google.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 16:17:03 -05:00
Tyler Baicar	bd237801fe	PCI/AER: Adopt lspci names for AER error decoding lspci uses abbreviated naming for AER error strings. Adopt the same naming convention for the AER printing so they match. Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:17:01 -05:00
Keith Busch	1e4511604d	PCI/AER: Expose internal API for obtaining AER information Export some common AER functions and structures for other PCI core drivers to use. Since this is making the function externally visible inside the PCI core, prepend "aer_" to the function name. Signed-off-by: Keith Busch <keith.busch@intel.com> [bhelgaas: move AER declarations from linux/aer.h to drivers/pci/pci.h] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:16:55 -05:00
Bjorn Helgaas	0b15f1e38f	PCI/AER: Use "PCI Express" consistently in Kconfig text Use "PCI Express" consistently in Kconfig text. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-06-11 08:11:47 -05:00
Bjorn Helgaas	4696b828ca	PCI/AER: Hoist aerdrv.c, aer_inject.c up to drivers/pci/pcie/ Hoist aerdrv.c, aer_inject.c up to drivers/pci/pcie/ so they're next to other PCIe service drivers. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-06-11 08:11:39 -05:00
Bjorn Helgaas	adc1f22f5a	PCI/AER: Squash Kconfig.debug into Kconfig Squash Kconfig.debug into Kconfig. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-06-11 08:11:32 -05:00
Bjorn Helgaas	23e672bc2a	PCI/AER: Move private AER things to aerdrv.c Most of the things in aerdrv.h are only used in aerdrv.c, so move them there. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-06-11 08:11:25 -05:00
Bjorn Helgaas	f53e7418c3	PCI/AER: Move aer_irq() declaration to portdrv.h The aer_irq() declaration is the only thing needed by aer_inject.c. Move it to portdrv.h so we eventually get rid of aerdrv.h completely. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-06-11 08:11:18 -05:00
Bjorn Helgaas	0544b04b79	PCI/AER: Move pcie_aer_get_firmware_first() to portdrv.h Move pcie_aer_get_firmware_first() to portdrv.h, where it can be more easily shared between AER and DPC. Then DPC no longer needs to include aer/aerdrv.h. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-06-11 08:11:11 -05:00
Bjorn Helgaas	16c33b1595	PCI/AER: Remove duplicate pcie_port_bus_type declaration pcie_port_bus_type is already declared in portdrv.h, so remove the unnecessary duplicate declaration in aerdrv.h. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-06-11 08:11:05 -05:00
Bjorn Helgaas	41cbc9eb1a	PCI/AER: Squash ecrc.c into aerdrv.c Squash ecrc.c into aerdrv.c. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-06-11 08:10:59 -05:00
Bjorn Helgaas	256a459370	PCI/AER: Squash aerdrv_acpi.c into aerdrv.c Squash aerdrv_acpi.c into aerdrv.c. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-06-11 08:10:53 -05:00
Bjorn Helgaas	0319c9a0ef	PCI/AER: Squash aerdrv_errprint.c into aerdrv.c Squash aerdrv_errprint.c into aerdrv.c. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-06-11 08:10:45 -05:00
Bjorn Helgaas	fd3362cb73	PCI/AER: Squash aerdrv_core.c into aerdrv.c Squash aerdrv_core.c into aerdrv.c. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-06-11 08:10:33 -05:00
Bjorn Helgaas	3c43a64cb3	PCI/AER: Reorder code to group probe/remove stuff together Reorder code to group probe/remove stuff together. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-06-11 08:10:27 -05:00
Bjorn Helgaas	0054ca8e10	PCI/AER: Remove forward declarations Reorder code to remove forward declarations. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-06-08 08:31:42 -05:00
Bjorn Helgaas	ae08aa13ba	Merge branch 'pci/portdrv' - remove unused pcie_port_acpi_setup() and portdrv_acpi.c (Bjorn Helgaas) * pci/portdrv: PCI/portdrv: Remove unused pcie_port_acpi_setup()	2018-06-06 16:10:17 -05:00
Bjorn Helgaas	f64c146410	Merge branch 'pci/hotplug' - fix use-before-set error in ibmphp (Dan Carpenter) - fix pciehp timeouts caused by Command Completed errata (Bjorn Helgaas) - fix refcounting in pnv_php hotplug (Julia Lawall) - clear pciehp Presence Detect and Data Link Layer Status Changed on resume so we don't miss hotplug events (Mika Westerberg) - only request pciehp control if we support it, so platform can use ACPI hotplug otherwise (Mika Westerberg) - convert SHPC to be builtin only (Mika Westerberg) - request SHPC control via _OSC if we support it (Mika Westerberg) - simplify SHPC handoff from firmware (Mika Westerberg) * pci/hotplug: PCI: Improve "partially hidden behind bridge" log message PCI: Improve pci_scan_bridge() and pci_scan_bridge_extend() doc PCI: Move resource distribution for single bridge outside loop PCI: Account for all bridges on bus when distributing bus numbers ACPI / hotplug / PCI: Drop unnecessary parentheses ACPI / hotplug / PCI: Mark stale PCI devices disconnected ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug PCI: hotplug: Add hotplug_is_native() PCI: shpchp: Add shpchp_is_native() PCI: shpchp: Fix AMD POGO identification PCI: shpchp: Use dev_printk() for OSHP-related messages PCI: shpchp: Remove get_hp_hw_control_from_firmware() wrapper PCI: shpchp: Remove acpi_get_hp_hw_control_from_firmware() flags PCI: shpchp: Rely on previous _OSC results PCI: shpchp: Request SHPC control via _OSC when adding host bridge PCI: shpchp: Convert SHPC to be builtin only PCI: pciehp: Make pciehp_is_native() stricter PCI: pciehp: Rename host->native_hotplug to host->native_pcie_hotplug PCI: pciehp: Request control of native hotplug only if supported PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume PCI: pnv_php: Add missing of_node_put() PCI: pciehp: Add quirk for Command Completed errata PCI: Add Qualcomm vendor ID PCI: ibmphp: Fix use-before-set in get_max_bus_speed() # Conflicts: # drivers/acpi/pci_root.c	2018-06-06 16:10:10 -05:00
Bjorn Helgaas	8e069da28d	Merge branch 'pci/dpc' - clear interrupt status in top half to avoid interrupt storm (Oza Pawandeep) * pci/dpc: PCI/DPC: Clear interrupt status in interrupt handler top half	2018-06-06 16:10:06 -05:00
Bjorn Helgaas	08b5b2f783	Merge branch 'pci/aspm' - disable ASPM L1.2 substate if we don't have LTR (Bjorn Helgaas) - respect platform ownership of LTR (Bjorn Helgaas) * pci/aspm: PCI/ACPI: Request LTR control from platform before using it PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR	2018-06-06 16:10:05 -05:00
Keith Busch	e13d17f732	PCI/AER: Replace struct pcie_device with pci_dev The AER driver only needed the pcie_device to get to the port pci_dev. Save the pci_dev pointer directly in struct aer_rpc and remove the unnecessary indirection. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-06-05 17:08:52 -05:00
Keith Busch	8f6fb67775	PCI/AER: Remove unused parameters Remove unused "struct pcie_device *" parameters to handle_error_source() and aer_process_err_devices(). No functional change intended. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-06-05 17:08:52 -05:00
Bjorn Helgaas	010caed4cc	PCI/AER: Decode Error Source Requester ID Decode the Requester ID from the AER Error Source Register into domain/ bus/device/function format to match other logging. In cases where the ID matches the device used for pci_err(), drop the extra ID completely so we don't print it twice. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-06-02 19:29:29 -05:00
Borislav Petkov	ad4050dcda	PCI/AER: Remove aer_recover_work_func() forward declaration Just move the actual function up so that it is visible to its user aer_recover_queue(). No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-06-02 19:29:27 -05:00
Oza Pawandeep	b09803b5e5	PCI/DPC: Use the generic pcie_do_fatal_recovery() path Our goal is to handle ERR_FATAL errors similarly, whether they are reported via AER or via DPC. A previous commit changed AER so it handles ERR_FATAL by calling driver .remove() methods and resetting the Link. DPC already does that (although the Link reset is done automatically by hardware and happens before we call the driver .remove() methods). Restructure the DPC code so it calls the same pcie_do_fatal_recovery() interface used by AER. This makes it clearer that we want to use the same path. Implement the .reset_link() method used by pcie_do_fatal_recovery(). For DPC, the actual reset is done automatically by hardware, so we really only have to wait for the Link to be inactive, then release the Port from DPC. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: changelog, DPC_FATAL is not a bitfield, can be sequential] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-06-02 19:29:27 -05:00
Oza Pawandeep	0b91439d35	PCI/AER: Pass service type to pcie_do_fatal_recovery() Pass the service type to pcie_do_fatal_recovery() instead of assuming AER. We will make DPC also use pcie_do_fatal_recovery(), and it needs to do things a little differently for AER and DPC. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: split to separate patch] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-06-02 19:29:26 -05:00
Oza Pawandeep	6927868e7a	PCI/DPC: Disable ERR_NONFATAL handling by DPC PCIe ERR_NONFATAL errors mean a particular transaction is unreliable but the Link is otherwise fully functional (PCIe r4.0, sec 6.2.2). The AER driver handles these by logging the error details and calling driver-supplied pci_error_handlers callbacks. It does not reset downstream devices, does not remove them from the PCI subsystem, does not re-enumerate them, and does not call their driver .remove() or .probe() methods. But DPC driver previously enabled DPC on ERR_NONFATAL, so if the hardware supports DPC, these errors caused a Link reset (performed automatically by the hardware), followed by the DPC driver removing affected devices (which calls their .remove() methods), bringing the Link back up, and re-enumerating (which calls driver .probe() methods). Disable ERR_NONFATAL DPC triggering so these errors will only be handled by AER. This means drivers won't have to deal with different usage of their pci_error_handlers callbacks and .probe() and .remove() methods based on whether the platform has DPC support. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-06-02 19:29:25 -05:00
Oza Pawandeep	e76d596aef	PCI/portdrv: Add generic pcie_port_find_device() Add generic pcie_port_find_device() routine. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-06-02 19:29:24 -05:00
Mika Westerberg	5352a44a56	PCI: pciehp: Make pciehp_is_native() stricter Previously pciehp_is_native() returned true for any PCI device in a hierarchy where _OSC says we can use pciehp. This is incorrect because bridges without PCI_EXP_SLTCAP_HPC capability should be managed by acpiphp instead. Improve pciehp_is_native() to return true only when PCI_EXP_SLTCAP_HPC is set and the pciehp driver is present. In any other case return false to let acpiphp handle those. Suggested-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> [bhelgaas: remove NULL pointer check] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-06-02 00:18:28 -05:00
Mika Westerberg	9310f0dc1c	PCI: pciehp: Rename host->native_hotplug to host->native_pcie_hotplug Rename host->native_hotplug to host->native_pcie_hotplug to make room for a similar flag for SHPC hotplug. Suggested-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> [bhelgaas: split to separate patch] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-06-02 00:18:28 -05:00
Oza Pawandeep	f252d0621a	PCI/portdrv: Add generic pcie_port_find_service() Add generic pcie_port_find_service() routine. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-05-17 16:49:30 -05:00
Oza Pawandeep	2e28bc84cf	PCI/AER: Factor out error reporting to drivers/pci/pcie/err.c Move the error reporting callbacks from aerdrv_core.c to err.c, where they can be used by DPC in addition to AER. As part of aerdrv_core.c, these callbacks were built under CONFIG_PCIEAER. Moving them to the new err.c means they will now be built under CONFIG_PCIEPORTBUS, so adjust the definition of pci_uevent_ers() to match. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: in reset_link(), initialize "driver" even if CONFIG_PCIEAER is unset, update pci_uevent_ers() #ifdef wrapper] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-05-17 16:48:23 -05:00
Oza Pawandeep	d25e28e8d2	PCI/AER: Rename error recovery interfaces to generic PCI naming Rename error recovery interfaces with "pcie_" prefix so they can be made non-static. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: move declaration to later patch, leave functions static] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-05-17 16:48:15 -05:00
Oza Pawandeep	7e9084b367	PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices PCIe ERR_FATAL errors mean the Link is unreliable. Components on the Link may need to be reset to return to reliable operation (PCIe r4.0, sec 6.2.2). We previously handled these errors much differently depending on whether the platform supports Downstream Port Containment (DPC) (PCIe r4.0, sec 6.2.10) or not. The AER driver has historically logged the error details, called driver-supplied pci_error_handlers callbacks, and reset the Link. This reset downstream devices, but did not remove them from the PCI subsystem, re-enumerate them, or call their driver .remove() or .probe() methods. DPC is different because the hardware automatically disables the Link when it detects ERR_FATAL, which resets downstream devices. There's no opportunity for pci_error_handlers callbacks before resetting the Link. The DPC driver removes affected devices (which calls their driver .remove() methods), brings the Link back up, and re-enumerates (which calls driver .probe() methods). Align AER ERR_FATAL handling with DPC by resetting the Link in software, skipping the driver pci_error_handlers callbacks, removing the devices from the PCI subsystem, and re-enumerating. The idea is that drivers and devices should see the same behavior for ERR_FATAL events, regardless of whether they're handled by AER or DPC. Here are the basic ERR_FATAL recovery steps, showing the previous AER behavior, the AER behavior after this patch, and the DPC behavior: AER AER DPC previous new behavior -------- --- -------- Log error yes yes yes (minimal) drv.error_detected() yes no no Reset Link yes yes yes drv.mmio_enabled() yes no no drv.slot_reset() yes no no drv.resume() yes no no Remove PCI devices no yes yes (calls drv.remove()) Re-enumerate no yes yes (calls drv.probe()) N.B. With DPC, the Link reset happens before the driver .remove() calls, while with AER, the reset happens after the .remove() calls. The goal is to eventually do the reset before .remove() for AER as well. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: changelog, squash doc patch into this, remove unused "result_data"] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-05-17 16:44:13 -05:00
Oza Pawandeep	9f5a70f18c	PCI: Add generic pcie_wait_for_link() interface Clients such as hotplug and Downstream Port Containment (DPC) both need to wait until a link becomes active or inactive. Add a generic pcie_wait_link_active() interface and use it instead of duplicating the code. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-05-17 16:44:11 -05:00
Oza Pawandeep	56abbf8ad7	PCI/DPC: Clear interrupt status in interrupt handler top half The generic IRQ handling code ensures that an interrupt handler runs with its interrupt masked or disabled. If the interrupt is level-triggered, the interrupt handler must tell its device to stop asserting the interrupt before returning. If it doesn't, we will immediately take the interrupt again when the handler returns and the generic code unmasks the interrupt. The driver doesn't know whether its interrupt is edge- or level-triggered, so it must clear its interrupt source directly in its interrupt handler. Previously we cleared the DPC interrupt status in the bottom half, i.e., in deferred work, which can cause an interrupt storm if the DPC interrupt happens to be level-triggered, e.g., if we're using INTx instead of MSI. Clear the DPC interrupt status bit in the interrupt handler, not in the deferred work. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <helgaas@kernel.org> Reviewed-by: Keith Busch <keith.busch@intel.com>	2018-05-16 15:59:35 -05:00
Thomas Tai	2af8641b2a	PCI/AER: Add TLP header information to tracepoint When a PCIe AER error occurs, the TLP header information is printed in the kernel message but it is missing from the tracepoint. A userspace program can use this information in the tracepoint to better analyze problems. To enable the tracepoint: echo 1 > /sys/kernel/debug/tracing/events/ras/aer_event/enable Example tracepoint output: $ cat /sys/kernel/debug/tracing/trace aer_event: 0000:01:00.0 PCIe Bus Error: severity=Uncorrected, non-fatal, Completer Abort TLP Header={0x0,0x1,0x2,0x3} Signed-off-by: Thomas Tai <thomas.tai@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2018-05-10 08:34:52 -05:00
Alexandru Gagniuc	5d0b401f4c	PCI/AER: Unify error bit printing for native and CPER reporting AER errors can be reported natively (Linux AER driver fields interrupts and reads error state directly from hardware) or via the ACPI/APEI/GHES/CPER path (platform firmware reads error state from hardware and sends it to Linux via ACPI interfaces). Previously the same error would produce different output depending on whether it was reported natively or via ACPI. The CPER path resulted in hard-to-understand messages, without a prefix. Instead use __aer_print_error() for both native AER and CPER to provide a more consistent log format. Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-05-07 17:07:21 -05:00
Bjorn Helgaas	5d20637b91	PCI/portdrv: Remove unused pcie_port_acpi_setup() `02bfeb4842` ("PCI/portdrv: Simplify PCIe feature permission checking") removed the only call of pcie_port_acpi_setup() and removed portdrv_acpi.o from the Makefile, but I forgot to remove pcie_port_acpi_setup() itself. Remove pcie_port_acpi_setup() and the drivers/pci/pcie/portdrv_acpi.c file. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-05-02 17:31:48 -05:00
Bjorn Helgaas	9ab105deb6	PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR When in the ASPM L1.0 state (but not the PCI-PM L1.0 state), the most recent LTR value and the LTR_L1.2_THRESHOLD determines whether the link enters the L1.2 substate. If we don't have LTR enabled, prevent the use of ASPM L1.2. PCI-PM L1.2 may still be used because it doesn't depend on LTR_L1.2_THRESHOLD (see PCIe r4.0, sec 5.5.1). Tested-by: Srinath Mannam <srinath.mannam@broadcom.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-04-18 16:10:34 -05:00
Bjorn Helgaas	64ae499cf2	Merge branch 'pci/portdrv' - move pcieport_if.h to drivers/pci/pcie/ to encapsulate it (Frederick Lawler) - merge pcieport_if.h into portdrv.h (Bjorn Helgaas) - move workaround for BIOS PME issue from portdrv to PCI core (Bjorn Helgaas) - completely disable portdrv with "pcie_ports=compat" (Bjorn Helgaas) - remove portdrv link order dependency (Bjorn Helgaas) - remove support for unused VC portdrv service (Bjorn Helgaas) - simplify portdrv feature permission checking (Bjorn Helgaas) - remove "pcie_hp=nomsi" parameter (use "pci=nomsi" instead) (Bjorn Helgaas) - remove unnecessary "pcie_ports=auto" parameter (Bjorn Helgaas) - use cached AER capability offset (Frederick Lawler) - don't enable DPC if BIOS hasn't granted AER control (Mika Westerberg) - rename pcie-dpc.c to dpc.c (Bjorn Helgaas) * pci/portdrv: PCI/DPC: Rename from pcie-dpc.c to dpc.c PCI/DPC: Do not enable DPC if AER control is not allowed by the BIOS PCI/AER: Use cached AER Capability offset PCI/portdrv: Rename and reverse sense of pcie_ports_auto PCI/portdrv: Encapsulate pcie_ports_auto inside the port driver PCI/portdrv: Remove unnecessary "pcie_ports=auto" parameter PCI/portdrv: Remove "pcie_hp=nomsi" kernel parameter PCI/portdrv: Remove unnecessary include of <linux/pci-aspm.h> PCI/portdrv: Simplify PCIe feature permission checking PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC PCI/portdrv: Remove pcie_port_bus_type link order dependency PCI/portdrv: Disable port driver in compat mode PCI/PM: Clear PCIe PME Status bit for Root Complex Event Collectors PCI/PM: Clear PCIe PME Status bit in core, not PCIe port driver PCI/PM: Move pcie_clear_root_pme_status() to core PCI/portdrv: Merge pcieport_if.h into portdrv.h PCI/portdrv: Move pcieport_if.h to drivers/pci/pcie/ Conflicts: drivers/pci/pcie/Makefile drivers/pci/pcie/portdrv.h	2018-04-04 13:27:58 -05:00
Bjorn Helgaas	43b90eaed5	Merge branch 'pci/misc' - use PCI_EXP_DEVCTL2_COMP_TIMEOUT in rapidio/tsi721 (Bjorn Helgaas) - remove possible NULL pointer dereference in of_pci_bus_find_domain_nr() (Shawn Lin) - report quirk timings with dev_info (Bjorn Helgaas) - report quirks that take longer than 10ms (Bjorn Helgaas) - add and use Altera Vendor ID (Johannes Thumshirn) - tidy Makefiles and comments (Bjorn Helgaas) * pci/misc: PCI: Always define the of_node helpers PCI: Tidy comments PCI: Tidy Makefiles mcb: Add Altera PCI ID to mcb-pci PCI: Add Altera vendor ID PCI: Report quirks that take more than 10ms PCI: Report quirk timings with pci_info() instead of pr_debug() PCI: Fix NULL pointer dereference in of_pci_bus_find_domain_nr() rapidio/tsi721: use PCI_EXP_DEVCTL2_COMP_TIMEOUT macro	2018-04-04 13:27:45 -05:00
Bjorn Helgaas	e02602bd76	PCI/DPC: Rename from pcie-dpc.c to dpc.c Rename pcie-dpc.c to dpc.c. The path "drivers/pci/pcie/pcie-dpc.c" has more occurrences of "pci" than necessary. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-03-31 17:48:57 -05:00
Mika Westerberg	4e5fad429b	PCI/DPC: Do not enable DPC if AER control is not allowed by the BIOS Commit `eed85ff4c0` ("PCI/DPC: Enable DPC only if AER is available") made DPC control dependent whether AER is enabled in the OS. However, it does not take into account situations where BIOS has not given OS control of AER: acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI] acpi PNP0A08:00: _OSC: platform does not support [AER] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME PCIeCapability] I think here it is better not to enable DPC even if the capability is available because then it would be against what "Determination of DPC Control" note in PCIe 4.0 sec 6.1.10 recommends. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>	2018-03-30 17:33:34 -05:00
Frederick Lawler	f0553ba08a	PCI/AER: Use cached AER Capability offset Replace pci_find_ext_capability(..., PCI_EXT_CAP_ID_ERR) calls with pci_dev->aer_cap. pci_dev->aer_cap is initialized in pci_init_capabilities(), which happens before any of these users of the AER Capability. Signed-off-by: Frederick Lawler <fred@fredlawl.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-03-30 17:26:59 -05:00
Bjorn Helgaas	d850882b72	PCI/portdrv: Rename and reverse sense of pcie_ports_auto The platform may restrict the OS's use of PCIe services, e.g., via the ACPI _OSC method. The user may use "pcie_ports=native" to force the port driver to use PCIe services even if the platform asked us not to. The "pcie_ports=native" parameter determines the setting of pcie_ports_auto. Rename this to pcie_ports_native and reverse the sense to simplify the code. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-03-30 17:26:58 -05:00
Bjorn Helgaas	842b447f00	PCI/portdrv: Encapsulate pcie_ports_auto inside the port driver "pcie_ports_auto" is only used inside the PCIe port driver itself, so move it from include/linux/pci.h to portdrv.h so it's not visible to the whole kernel. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-03-30 17:26:58 -05:00
Bjorn Helgaas	4c0fd7648d	PCI/portdrv: Remove unnecessary "pcie_ports=auto" parameter The "pcie_ports=auto" parameter set pcie_ports_disabled and pcie_ports_auto to their compiled-in defaults, so specifying the parameter is the same as not using it at all. Remove the "pcie_ports=auto" parameter and update the documentation. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-03-30 17:26:57 -05:00
Bjorn Helgaas	1e447c57ae	PCI/portdrv: Remove "pcie_hp=nomsi" kernel parameter `7570a333d8` ("PCI: Add pcie_hp=nomsi to disable MSI/MSI-X for pciehp driver") added the "pcie_hp=nomsi" kernel parameter to work around this error on shutdown: irq 16: nobody cared (try booting with the "irqpoll" option) Pid: 1081, comm: reboot Not tainted 3.2.0 #1 ... Disabling IRQ #16 This happened on an unspecified system (possibly involving the Integrated Device Technology, Inc. Device 807f bridge) where "an un-wanted interrupt is generated when PCI driver switches from MSI/MSI-X to INTx while shutting down the device." The implication was that the device was buggy, but it is normal for a device to use INTx after MSI/MSI-X have been disabled. The only problem was that the driver was still attached and it wasn't prepared for INTx interrupts. Prarit Bhargava fixed this issue with `fda78d7a0e` ("PCI/MSI: Stop disabling MSI/MSI-X in pci_device_shutdown()"). There is no automated way to set this parameter, so it's not very useful for distributions or end users. It's really only useful for debugging, and we have "pci=nomsi" for that purpose. Revert `7570a333d8` to remove the "pcie_hp=nomsi" parameter. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> CC: MUNEDA Takahiro <muneda.takahiro@jp.fujitsu.com> CC: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> CC: Prarit Bhargava <prarit@redhat.com>	2018-03-30 17:26:56 -05:00
Bjorn Helgaas	1b64cb87cf	PCI/portdrv: Remove unnecessary include of <linux/pci-aspm.h> portdrv_pci.c doesn't use anything from <linux/pci-aspm.h>. Remove the include of it. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-03-30 17:26:55 -05:00
Bjorn Helgaas	02bfeb4842	PCI/portdrv: Simplify PCIe feature permission checking Some PCIe features (AER, DPC, hotplug, PME) can be managed by either the platform firmware or the OS, so the host bridge driver may have to request permission from the platform before using them. On ACPI systems, this is done by negotiate_os_control() in acpi_pci_root_add(). The PCIe port driver later uses pcie_port_platform_notify() and pcie_port_acpi_setup() to figure out whether it can use these features. But all we need is a single bit for each service, so these interfaces are needlessly complicated. Simplify this by adding bits in the struct pci_host_bridge to show when the OS has permission to use each feature: + unsigned int native_aer:1; /* OS may use PCIe AER / + unsigned int native_hotplug:1; / OS may use PCIe hotplug / + unsigned int native_pme:1; / OS may use PCIe PME */ These are set when we create a host bridge, and the host bridge driver can clear the bits corresponding to any feature the platform doesn't want us to use. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-03-30 17:26:54 -05:00
Bjorn Helgaas	168f3ae595	PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC No driver registers for PCIE_PORT_SERVICE_VC, so remove it. This removes the VC "service" files from /sys/bus/pci_express/devices, e.g., 0000:07:00.0:pcie108, 0000:08:04.0:pcie208 (all the files that contained "8" as the last digit of the "pcieXXX" part). The port driver created these files for PCIe port devices that have a VC Capability. Since this reduces PCIE_PORT_DEVICE_MAXSERVICES and moves DPC down into the spot where VC used to be, the DPC sysfs files will now be named "pcieXX8". I don't think there's anything useful userspace can do with those files, so I hope nobody cares about these filenames. There is no VC driver that calls pcie_port_service_register(), so there never was a /sys/bus/pci_express/drivers/vc directory. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-03-30 17:26:54 -05:00
Bjorn Helgaas	c6c889d932	PCI/portdrv: Remove pcie_port_bus_type link order dependency The pcie_port_bus_type must be registered before drivers that depend on it can be registered. Those drivers include: pcied_init() # PCIe native hotplug driver aer_service_init() # AER driver dpc_service_init() # DPC driver pcie_pme_service_init() # PME driver Previously we registered pcie_port_bus_type from pcie_portdrv_init(), a device_initcall. The callers of pcie_port_service_register() (above) are also device_initcalls. This is fragile because the device_initcall ordering depends on link order, which is not explicit. Register pcie_port_bus_type from pci_driver_init() along with pci_bus_type. This removes the link order dependency between portdrv and the pciehp, AER, DPC, and PCIe PME drivers. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-03-30 17:26:53 -05:00
Bjorn Helgaas	79a011194b	PCI/portdrv: Disable port driver in compat mode The "pcie_ports=compat" kernel parameter sets pcie_ports_disabled, which is intended to disable the PCIe port driver. But even when it was disabled, we registered pcie_portdriver so we could work around a BIOS PME issue (see `fe31e69740` ("PCI/PCIe: Clear Root PME Status bits early during system resume")). Registering the driver meant that the pcie_portdrv_probe() path called pci_enable_device(), pci_save_state(), pm_runtime_set_autosuspend_delay(), pm_runtime_use_autosuspend(), etc., even when the driver was disabled. We've since moved the BIOS PME workaround from the port driver to the core, so stop registering the PCIe port driver in compat mode. This means "pcie_ports=compat" will now be basically the same as turning off CONFIG_PCIEPORTBUS completely. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-03-30 17:26:52 -05:00

1 2 3 4 5 ...

677 Commits