When a device is hotplugged, Presence Detect and Link Up events often do
not occur simultaneously, but with a lag of a few milliseconds. Only
the first event received is relevant, the other one can be disregarded.
Moreover, Stefan Roese reports that on certain platforms, Link State and
Presence Detect may flap for up to 100 ms before stabilizing, suggesting
that such events should be disregarded for at least this long:
https://lkml.kernel.org/r/20180130084121.18653-1-sr@denx.de
On slot enablement, pciehp_check_link_status() waits for 100 ms per
PCIe r4.0, sec 6.7.3.3, then probes the hotplugged device's vendor
register for up to 1 second.
If this succeeds, the link is definitely up, so ignore any Presence
Detect or Link State events that occurred up to this point.
pciehp_check_link_status() then checks the Link Training bit in the
Link Status register. This is the final opportunity to detect
inaccessibility of the device and abort slot enablement. Any link
or presence change that occurs afterwards will cause the slot to be
disabled again immediately after attempting to enable it.
The astute reviewer may appreciate that achieving this behavior would be
more complicated had pciehp not just been converted to enable/disable
the slot exclusively from the IRQ thread: When the slot is enabled via
sysfs, each link or presence flap would otherwise cause the IRQ thread
to run and it would have to sense that those events are belonging to a
concurrent slot enablement operation and disregard them. It would be
much more difficult than this mere 3 line change.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stefan Roese <sr@denx.de>
No callers of pciehp_enable/disable_slot() outside of pciehp_ctrl.c
remain, so declare the functions static. For now this requires forward
declarations. Those can be eliminated by reshuffling functions once the
ongoing effort to refactor the driver has settled.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Previously slot enablement and disablement could happen concurrently.
But now it's under the exclusive control of the IRQ thread, rendering
the locking obsolete. Drop it.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Besides the IRQ thread, there are several other places in the driver
which enable or disable the slot:
- pciehp_probe() enables the slot if it's occupied and the pciehp_force
module parameter is used.
- pciehp_resume() enables or disables the slot after system sleep.
- pciehp_queue_pushbutton_work() enables or disables the slot after the
5 second delay following an Attention Button press.
- pciehp_sysfs_enable_slot() and pciehp_sysfs_disable_slot() enable or
disable the slot on sysfs write.
This requires locking and complicates pciehp's state machine.
A simplification can be achieved by enabling and disabling the slot
exclusively from the IRQ thread.
Amend the functions listed above to request slot enable/disablement from
the IRQ thread by either synthesizing a Presence Detect Changed event or,
in the case of a disable user request (via sysfs or an Attention Button
press), submitting a newly introduced force disable request. The latter
is needed because the slot shall be forced off despite being occupied.
For this force disable request, avoid colliding with Slot Status register
bits by using a bit number greater than 16.
For synchronous execution of requests (on sysfs write), wait for the
request to finish and retrieve the result. There can only ever be one
sysfs write in flight due to the locking in kernfs_fop_write(), hence
there is no risk of returning the result of a different sysfs request to
user space.
The POWERON_STATE and POWEROFF_STATE is now no longer entered by the
above-listed functions, but solely by the IRQ thread when it begins a
power transition. Afterwards, it moves to STATIC_STATE. The same
applies to canceling the Attention Button work, it likewise becomes an
IRQ thread only operation.
An immediate consequence is that the POWERON_STATE and POWEROFF_STATE is
never observed by the IRQ thread itself, only by functions called in a
different context, such as pciehp_sysfs_enable_slot(). So remove
handling of these states from pciehp_handle_button_press() and
pciehp_handle_link_change() which are exclusively called from the IRQ
thread.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
handle_button_press_event() currently determines whether the slot has
been turned on or off by looking at the Power Controller Control bit in
the Slot Control register. This assumes that an attention button
implies presence of a power controller even though that's not mandated
by the spec. Moreover the Power Controller Control bit is unreliable
when a power fault occurs (PCIe r4.0, sec 6.7.1.8). This issue has
existed since the driver was introduced in 2004.
Fix by replacing STATIC_STATE with ON_STATE and OFF_STATE and tracking
whether the slot has been turned on or off. This is also a required
ingredient to make pciehp resilient to missed events, which is the
object of an upcoming commit.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The PCI hotplug core has just been refactored to separate slot
initialization for in-kernel use from publication to user space.
Take advantage of it in pciehp by publishing to user space last on
probe. This will allow enable/disablement of the slot exclusively from
the IRQ thread because the IRQ is requested after initialization for
in-kernel use (thereby getting its unique name needed by the IRQ thread)
but before user space is able to submit enable/disable requests.
On teardown, the order is the same in reverse: The user space interface
is removed prior to freeing the IRQ and destroying the slot.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When a hotplug driver calls pci_hp_register(), all steps necessary for
registration are carried out in one go, including creation of a kobject
and addition to sysfs. That's a problem for pciehp once it's converted
to enable/disable the slot exclusively from the IRQ thread: The thread
needs to be spawned after creation of the kobject (because it uses the
kobject's name), but before addition to sysfs (because it will handle
enable/disable requests submitted via sysfs).
pci_hp_deregister() does offer a ->release callback that's invoked
after deletion from sysfs and before destruction of the kobject. But
because pci_hp_register() doesn't offer a counterpart, hotplug drivers'
->probe and ->remove code becomes asymmetric, which is error prone
as recently discovered use-after-free bugs in pciehp's ->remove hook
have shown.
In a sense, this appears to be a case of the midlayer antipattern:
"The core thesis of the "midlayer mistake" is that midlayers are
bad and should not exist. That common functionality which it is
so tempting to put in a midlayer should instead be provided as
library routines which can [be] used, augmented, or ignored by
each bottom level driver independently. Thus every subsystem
that supports multiple implementations (or drivers) should
provide a very thin top layer which calls directly into the
bottom layer drivers, and a rich library of support code that
eases the implementation of those drivers. This library is
available to, but not forced upon, those drivers."
-- Neil Brown (2009), https://lwn.net/Articles/336262/
The presence of midlayer traits in the PCI hotplug core might be ascribed
to its age: When it was introduced in February 2002, the blessings of a
library approach might not have been well known:
https://git.kernel.org/tglx/history/c/a8a2069f432c
For comparison, the driver core does offer split functions for creating
a kobject (device_initialize()) and addition to sysfs (device_add()) as
an alternative to carrying out everything at once (device_register()).
This was introduced in October 2002:
https://git.kernel.org/tglx/history/c/8b290eb19962
The odd ->release callback in the PCI hotplug core was added in 2003:
https://git.kernel.org/tglx/history/c/69f8d663b595
Clearly, a library approach would not force every hotplug driver to
implement a ->release callback, but rather allow the driver to remove
the sysfs files, release its data structures and finally destroy the
kobject. Alternatively, a driver may choose to remove everything with
pci_hp_deregister(), then release its data structures.
To this end, offer drivers pci_hp_initialize() and pci_hp_add() as a
split-up version of pci_hp_register(). Likewise, offer pci_hp_del()
and pci_hp_destroy() as a split-up version of pci_hp_deregister().
Eliminate the ->release callback and move its code into each driver's
teardown routine.
Declare pci_hp_deregister() void, in keeping with the usual kernel
pattern that enablement can fail, but disablement cannot. It only
returned an error if the caller passed in a NULL pointer or a slot which
has never or is no longer registered or is sharing its name with another
slot. Those would be bugs, so WARN about them. Few hotplug drivers
actually checked the return value and those that did only printed a
useless error message to dmesg. Remove that.
For most drivers the conversion was straightforward since it doesn't
matter whether the code in the ->release callback is executed before or
after destruction of the kobject. But in the case of ibmphp, it was
unclear to me whether setting slot_cur->ctrl and slot_cur->bus_on to
NULL needs to happen before the kobject is destroyed, so I erred on
the side of caution and ensured that the order stays the same. Another
nontrivial case is pnv_php, I've found the list and kref logic difficult
to understand, however my impression was that it is safe to delete the
list element and drop the references until after the kobject is
destroyed.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> # drivers/platform/x86
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Scott Murray <scott@spiteful.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Corentin Chary <corentin.chary@gmail.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Andy Shevchenko <andy@infradead.org>
Previously the slot workqueue was used to handle events and enable or
disable the slot. That's no longer the case as those tasks are done
synchronously in the IRQ thread. The slot workqueue is thus merely used
to handle a button press after the 5 second delay and only one such work
item may be in flight at any given time. A separate workqueue isn't
necessary for this simple task, so use the system workqueue instead.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Up until now, pciehp's IRQ handler schedules a work item for each event,
which in turn schedules a work item to enable or disable the slot. This
double indirection was necessary because sleeping wasn't allowed in the
IRQ handler.
However it is now that pciehp has been converted to threaded IRQ handling
and polling, so handle events synchronously in pciehp_ist() and remove
the work item infrastructure (with the exception of work items to handle
a button press after the 5 second delay).
For link or presence change events, move the register read to determine
the current link or presence state behind acquisition of the slot lock
to prevent it from becoming stale while the lock is contended.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If the attention button is pressed to power on the slot AND the user
powers on the slot via sysfs before 5 seconds have elapsed AND powering
on the slot fails because either the slot is unoccupied OR the latch is
open, we neglect turning off the green LED so it keeps on blinking.
That's because the error path of pciehp_sysfs_enable_slot() doesn't call
pciehp_green_led_off(), unlike pciehp_power_thread() which does.
The bug has been present since 2004 when the driver was introduced.
Fix by deduplicating common code in pciehp_sysfs_enable_slot() and
pciehp_power_thread() into a wrapper function pciehp_enable_slot() and
renaming the existing function to __pciehp_enable_slot(). Same for
pciehp_disable_slot(). This will also simplify the upcoming rework of
pciehp's event handling.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We've just converted pciehp to threaded IRQ handling, but still cannot
sleep in pciehp_ist() because the function is also called in poll mode,
which runs in softirq context (from a timer).
Convert poll mode to a kthread so that pciehp_ist() always runs in task
context.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
pciehp's IRQ handler queues up a work item for each event signaled by
the hardware. A more modern alternative is to let a long running
kthread service the events. The IRQ handler's sole job is then to check
whether the IRQ originated from the device in question, acknowledge its
receipt to the hardware to quiesce the interrupt and wake up the kthread.
One benefit is reduced latency to handle the IRQ, which is a necessity
for realtime environments. Another benefit is that we can make pciehp
simpler and more robust by handling events synchronously in process
context, rather than asynchronously by queueing up work items. pciehp's
usage of work items is a historic artifact, it predates the introduction
of threaded IRQ handlers by two years. (The former was introduced in
2007 with commit 5d386e1ac4 ("pciehp: Event handling rework"), the
latter in 2009 with commit 3aa551c9b4 ("genirq: add threaded interrupt
handler support").)
Convert pciehp to threaded IRQ handling by retrieving the pending events
in pciehp_isr(), saving them for later consumption by the thread handler
pciehp_ist() and clearing them in the Slot Status register.
By clearing the Slot Status (and thereby acknowledging the events) in
pciehp_isr(), we can avoid requesting the IRQ with IRQF_ONESHOT, which
would have the unpleasant side effect of starving devices sharing the
IRQ until pciehp_ist() has finished.
pciehp_isr() does not count how many times each event occurred, but
merely records the fact *that* an event occurred. If the same event
occurs a second time before pciehp_ist() is woken, that second event
will not be recorded separately, which is problematic according to
commit fad214b0aa ("PCI: pciehp: Process all hotplug events before
looking for new ones") because we may miss removal of a card in-between
two back-to-back insertions. We're about to make pciehp_ist() resilient
to missed events. The present commit regresses the driver's behavior
temporarily in order to separate the changes into reviewable chunks.
This doesn't affect regular slow-motion hotplug, only plug-unplug-plug
operations that happen in a timespan shorter than wakeup of the IRQ
thread.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mayurkumar Patel <mayurkumar.patel@intel.com>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Document the driver's data structures to lower the barrier to entry for
contributors.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Since commit 0f4bd8014d ("PCI: hotplug: Drop checking of PCI_BRIDGE_
CONTROL in *_unconfigure_device()"), pciehp_unconfigure_device() can no
longer fail, so declare it and its sole caller remove_board() void, in
keeping with the usual kernel pattern that enablement can fail, but
disablement cannot. No functional change intended.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
pciehp_disable_slot() checks if the ctrl attribute of the slot is NULL
and bails out if so. However the function is not called prior to the
attribute being set in pcie_init_slot(), and pcie_init_slot() is not
called if ctrl is NULL. So the check is unnecessary. Drop it.
It has been present ever since the driver was introduced in 2004, but it
was already unnecessary back then:
https://git.kernel.org/tglx/history/c/c16b4b14d980
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Commit b440bde74f ("PCI: Add pci_ignore_hotplug() to ignore hotplug
events for a device") iterates over the devices on a hotplug port's
subordinate bus in pciehp's IRQ handler without acquiring pci_bus_sem.
It is thus possible for a user to cause a crash by concurrently
manipulating the device list, e.g. by disabling slot power via sysfs
on a different CPU or by initiating a remove/rescan via sysfs.
This can't be fixed by acquiring pci_bus_sem because it may sleep.
The simplest fix is to avoid the list iteration altogether and just
check the ignore_hotplug flag on the port itself. This works because
pci_ignore_hotplug() sets the flag both on the device as well as on its
parent bridge.
We do lose the ability to print the name of the device blocking hotplug
in the debug message, but that's probably bearable.
Fixes: b440bde74f ("PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
When pciehp is unbound (e.g. on unplug of a Thunderbolt device), the
hotplug_slot struct is deregistered and thus freed before freeing the
IRQ. The IRQ handler and the work items it schedules print the slot
name referenced from the freed structure in various informational and
debug log messages, each time resulting in a quadruple dereference of
freed pointers (hotplug_slot -> pci_slot -> kobject -> name).
At best the slot name is logged as "(null)", at worst kernel memory is
exposed in logs or the driver crashes:
pciehp 0000:10:00.0:pcie204: Slot((null)): Card not present
An attacker may provoke the bug by unplugging multiple devices on a
Thunderbolt daisy chain at once. Unplugging can also be simulated by
powering down slots via sysfs. The bug is particularly easy to trigger
in poll mode.
It has been present since the driver's introduction in 2004:
https://git.kernel.org/tglx/history/c/c16b4b14d980
Fix by rearranging teardown such that the IRQ is freed first. Run the
work items queued by the IRQ handler to completion before freeing the
hotplug_slot struct by draining the work queue from the ->release_slot
callback which is invoked by pci_hp_deregister().
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v2.6.4
If addition of sysfs files fails on registration of a hotplug slot, the
struct pci_slot as well as the entry in the slot_list is leaked. The
issue has been present since the hotplug core was introduced in 2002:
https://git.kernel.org/tglx/history/c/a8a2069f432c
Perhaps the idea was that even though sysfs addition fails, the slot
should still be usable. But that's not how drivers use the interface,
they abort probe if a non-zero value is returned.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v2.4.15+
Cc: Greg Kroah-Hartman <greg@kroah.com>
Ten years ago, commit 58319b802a ("PCI: Hotplug core: remove 'name'")
dropped the name element from struct hotplug_slot but neglected to update
the skeleton driver.
That same year, commit f46753c5e3 ("PCI: introduce pci_slot") raised the
number of arguments to pci_hp_register() from one to four.
Fourteen years ago, historic commit 7ab60fc1b8e7 ("PCI Hotplug skeleton:
final cleanups") removed all usages of the retval variable from
pcihp_skel_init() but not the variable itself, provoking a compiler
warning: https://git.kernel.org/tglx/history/c/7ab60fc1b8e7
It seems fair to assume the driver hasn't been used as a template for a new
driver in a while. Per Bjorn's and Christoph's preference, delete it.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Christoph Hellwig <hch@lst.de>
The pci_error_handlers.slot_reset() callback is only used for non-bridge
devices (see broadcast_error_message()). Since portdrv only binds to
bridges, we don't need pcie_portdrv_slot_reset(), so remove it.
Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: changelog, remove pcie_portdrv_slot_reset() completely]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In case of correctable error, the Correctable Error Detected bit in the
Device Status register is set. Clear it after handling the error.
Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Clear the device status bits while handling both ERR_FATAL and ERR_NONFATAL
cases.
Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: rename to pci_aer_clear_device_status(), declare internal to PCI
core instead of exposing it everywhere]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
broadcast_error_message() is only used for ERR_NONFATAL events, when the
state is always pci_channel_io_normal, so remove the unused alternate path.
Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
aer_error_resume() clears all ERR_NONFATAL error status bits. This is
exactly what pci_cleanup_aer_uncorrect_error_status(), so use that instead
of duplicating the code.
Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: split to separate patch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_cleanup_aer_uncorrect_error_status() is called by driver .slot_reset()
methods when handling ERR_NONFATAL errors. Previously this cleared *all*
the bits, including ERR_FATAL bits.
Since we're only handling ERR_NONFATAL errors, clear only the ERR_NONFATAL
error status bits.
Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: split to separate patch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
During recovery from fatal errors, we previously called
pci_cleanup_aer_uncorrect_error_status(), which cleared *all* uncorrectable
error status bits (both ERR_FATAL and ERR_NONFATAL).
Instead, call a new pci_aer_clear_fatal_status() that clears only the
ERR_FATAL bits (as indicated by the PCI_ERR_UNCOR_SEVER register).
Based-on-patch-by: Oza Pawandeep <poza@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Now that the old implementation of pci_reset_bus() is gone, replace
pci_try_reset_bus() with pci_reset_bus().
Compared to the old implementation, new code will fail immmediately with
-EAGAIN if object lock cannot be obtained.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_reset_bus() and pci_reset_slot() functions are not being used by any
code. Remove them from the kernel in favor of pci_try_reset_bus() and
pci_try_reset_slot() functions.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Drivers are expected to call pci_try_reset_slot() or pci_try_reset_bus() by
querying if a system supports hotplug or not. A survey showed that most
drivers don't do this and we are leaking hotplug capability to the user.
Hide pci_try_slot_reset() from drivers and embed into pci_try_bus_reset().
Change pci_try_reset_bus() parameter from struct pci_bus to struct pci_dev.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Rename pci_reset_bridge_secondary_bus() to pci_bridge_secondary_bus_reset()
and move the declaration from linux/pci.h to drivers/pci.h to be used
internally in PCI directory only.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Commit 01fd61c0b9 ("PCI: Add a return type for
pci_reset_bridge_secondary_bus()") added a return value to the function to
return if a device is accessible following a reset. Callers are not
checking the value.
Pass error code up high in the stack if device is not accessible.
Fixes: 01fd61c0b9 ("PCI: Add a return type for pci_reset_bridge_secondary_bus()")
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Simplify waiting for the contained link to become inactive, removing the
indirection to a unnecessary DPC-specific handler.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
Remove the work struct that was being used to handle a DPC event and use a
threaded IRQ instead.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
A DPC enabled device suppresses ERR_(NON)FATAL messages, preventing the AER
handler from reporting error details. If the DPC trigger reason says the
downstream port detected the error, collect the AER uncorrectable status
for logging, then clear the status.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
We don't need to save the rp pio status across multiple contexts as all
DPC event handling occurs in a single work queue context.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
Move all event handling to the existing work queue, which will
make it simpler to pass event information to the handler.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
Now that the DPC driver clears the interrupt status before exiting the
IRQ handler, we don't need to abuse the DPC control register to know if
a shared interrupt is for a new DPC event: a DPC port can not trigger
a second interrupt until the host clears the trigger status later in the
work queue handler.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
According to the documentation, "pcie_ports=native", linux should use
native AER and DPC services. While that is true for the _OSC method
parsing, this is not the only place that is checked. Should the HEST
list PCIe ports as firmware-first, linux will not use native services.
This happens because aer_acpi_firmware_first() doesn't take 'pcie_ports'
into account. This is wrong. DPC uses the same logic when it decides
whether to load or not, so fixing this also fixes DPC not loading.
Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
[bhelgaas: return "false" from bool function (from kbuild robot)]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add sysfs attributes for rootport statistics (that are cumulative of all
the ERR_* messages seen on this PCI hierarchy).
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add sysfs attributes to provide total and breakdown of the AERs seen,
into different type of correctable, fatal and nonfatal errors:
/sys/bus/pci/devices/<dev>/aer_dev_correctable
/sys/bus/pci/devices/<dev>/aer_dev_fatal
/sys/bus/pci/devices/<dev>/aer_dev_nonfatal
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Define a structure to hold the AER statistics. There are 2 groups of
statistics: dev_* counters that are to be collected for all AER capable
devices and rootport_* counters that are collected for all (AER capable)
rootports only. Allocate and free this structure when device is added or
released (thus counters survive the lifetime of the device).
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Since pci_aer_init() and pci_no_aer() are used only internally, move their
declarations to the PCI internal header file. Also, no one cares about
return value of pci_aer_init(), so make it void.
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
lspci uses abbreviated naming for AER error strings. Adopt the same naming
convention for the AER printing so they match.
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
Export some common AER functions and structures for other PCI core drivers
to use. Since this is making the function externally visible inside the
PCI core, prepend "aer_" to the function name.
Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: move AER declarations from linux/aer.h to drivers/pci/pci.h]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
Add pci_epc_set_msi() maximum 32 interrupts validation.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Add MSI-X support and update driver documentation accordingly.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Cleanup PCI_ENDPOINT_TEST memspace (by moving the interrupt number away
from command section).
Add IRQ_TYPE register to identify the triggered ID interrupt required
for the READ/WRITE/COPY tests and raise IRQ test commands.
Update documentation accordingly.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Currently DesignWare IP does not handle legacy interrupts.
Add a legacy interrupt callback handler.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Remove duplicate defines located on pcie-designware.h file already
available on /include/uapi/linux/pci-regs.h file.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Add PCIe config space capability search function.
Add sysfs set/get interface to allow the change of EP MSI-X maximum number.
Add EP MSI-X callback for triggering interruptions.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Change {cdns, dra7xx, artpec6, dw, rockchip}_pcie_ep_raise_irq() and
pci_epc_raise_irq() signature, namely the interrupt_num variable type
from u8 to u16 to accommodate 2048 maximum MSI-X interrupts.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Alan Douglas <adouglas@cadence.com>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Joao Pinto <jpinto@synopsys.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Move specific features settings from EP shared code
(pcie-designware-ep.c) to the driver (pcie-designware-plat.c).
Previous implementation disables the EP link notification
by default for all SoCs that uses EP DesignWare IP, which affects
directly the dra7xx and artpec6 SoCs.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Variable mmc is being assigned but is never used hence it is redundant
and can be removed.
Cleans up clang warning:
warning: variable 'mmc' set but not used [-Wunused-but-set-variable]
Signed-off-by: Colin Ian King <colin.king@canonical.com>
[lorenzo.pieralisi@arm.com: reworked commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Alan Douglas <adouglas@cadence.com>
Function dw_pcie_host_init() already initializes the root_bus_nr field
of 'struct pcie_port', so the -1 assignment prior to calling
dw_pcie_host_init() in platform specific driver is not really needed.
Drop it.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Pratyush Anand <pratyush.anand@gmail.com>
Function dw_pcie_host_init() already initializes the root_bus_nr field
of 'struct pcie_port', so the -1 assignment prior to calling
dw_pcie_host_init() in platform specific driver is not really needed.
Drop it.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Function dw_pcie_host_init() already initializes the root_bus_nr field
of 'struct pcie_port', so the -1 assignment prior to calling
dw_pcie_host_init() in platform specific driver is not really needed.
Drop it.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Jianguo Sun <sunjianguo1@huawei.com>
Function dw_pcie_host_init() already initializes the root_bus_nr field
of 'struct pcie_port', so the -1 assignment prior to calling
dw_pcie_host_init() in platform specific driver is not really needed.
Drop it.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Cc: Joao Pinto <Joao.Pinto@synopsys.com>
Function dw_pcie_host_init() already initializes the root_bus_nr field
of 'struct pcie_port', so the -1 assignment prior to calling
dw_pcie_host_init() in platform specific driver is not really needed.
Drop it.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Function dw_pcie_host_init() already initializes the root_bus_nr field
of 'struct pcie_port', so the -1 assignment prior to calling
dw_pcie_host_init() in platform specific driver is not really needed.
Drop it.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Function dw_pcie_host_init() already initializes the root_bus_nr field
of 'struct pcie_port', so the -1 assignment prior to calling
dw_pcie_host_init() in platform specific driver is not really needed.
Drop it.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Murali Karicheri <m-karicheri2@ti.com>
Function dw_pcie_host_init() already initializes the root_bus_nr field
of 'struct pcie_port', so the -1 assignment prior to calling
dw_pcie_host_init() in platform specific driver is not really needed.
Drop it.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Cc: Richard Zhu <hongxing.zhu@nxp.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Function dw_pcie_host_init() already initializes the root_bus_nr field
of 'struct pcie_port', so the -1 assignment prior to calling
dw_pcie_host_init() in platform specific driver is not really needed.
Drop it.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Jingoo Han <jingoohan1@gmail.com>
Reduce inbound/outbound mapping print level from dev_info() to
dev_dbg(). This reduces the console logs during Linux boot process.
Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
PAXC is an emulated PCIe root complex internally in various Broadcom
based SoCs. PAXC internally connects to the embedded network processor
within these SoCs, with the embedeed network processor exposed as an
endpoint device.
The number of physical functions from the embedded network processor
that can be accessed depends on the firmware configuration.
Unfortunately, due to an ASIC bug, unconfigured physical functions cannot
be properly hidden from the root complex during enumerattion. As a
result, config write access to these unconfigured physical functions
during enumeration will cause a bus lock up on the embedded network
processor.
Fortunately, these unconfigured physical functions contain a very
specific, staled PCIe device ID 0x168e. By making use of this device ID,
one is able to terminate the enumeration early in the vendor/device ID
config read.
Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
The internal MSI parsing logic in certain revisions of PAXC root
complexes does not work properly and can cause corruptions on the
writes transactions so they need to be disabled.
Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
On certain versions of Broadcom PAXC based root complexes, certain
regions of the configuration space are corrupted. As a result, it
prevents the Linux PCIe stack from traversing the linked list of the
capability registers completely and therefore the root complex is
not advertised as "PCIe capable". This prevents the correct PCIe RID
from being parsed in the kernel PCIe stack. A correct RID is required
for mapping to a stream ID from the SMMU or the device ID from the
GICv3 ITS.
This patch fixes up the issue by manually populating the related
PCIe capabilities.
Signed-off-by: Ray Jui <rjui@broadcom.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
Activate PAXC bridge quirk for more PAXC based PCIe root complex with
the following PCIe device ID:
0xd750, 0xd802, 0xd804
Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where
we are expecting to fall through.
Warning level 2 was used: -Wimplicit-fallthrough=2
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some IDT switches incorrectly flag an ACS Source Validation error on
completions for config read requests even though PCIe r4.0, sec 6.12.1.1,
says that completions are never affected by ACS Source Validation. Here's
the text of IDT 89H32H8G3-YC, erratum #36:
Item #36 - Downstream port applies ACS Source Validation to Completions
Section 6.12.1.1 of the PCI Express Base Specification 3.1 states that
completions are never affected by ACS Source Validation. However,
completions received by a downstream port of the PCIe switch from a
device that has not yet captured a PCIe bus number are incorrectly
dropped by ACS Source Validation by the switch downstream port.
Workaround: Issue a CfgWr1 to the downstream device before issuing the
first CfgRd1 to the device. This allows the downstream device to capture
its bus number; ACS Source Validation no longer stops completions from
being forwarded by the downstream port. It has been observed that
Microsoft Windows implements this workaround already; however, some
versions of Linux and other operating systems may not.
When doing the first config read to probe for a device, if the device is
behind an IDT switch with this erratum:
1. Disable ACS Source Validation if enabled
2. Wait for device to become ready to accept config accesses (by using
the Config Request Retry Status mechanism)
3. Do a config write to the endpoint
4. Enable ACS Source Validation (if it was enabled to begin with)
The workaround suggested by IDT is basically only step 3, but we don't know
when the device is ready to accept config requests. That means we need to
do config reads until we receive a non-Config Request Retry Status, which
means we need to disable ACS SV temporarily.
Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
[bhelgaas: changelog, clean up whitespace, fold in unused variable fix
from Anders Roxell <anders.roxell@linaro.org>]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Add shutdown callback to host driver which will disable PHY and
PM runtime.
Signed-off-by: Alan Douglas <adouglas@cadence.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
These PM ops will enable/disable the optional PHYs if present. The
AXI link-down register in the host driver is now cleared in
cdns_pci_map_bus() since the link-down bit will be set if the PHY has
been disabled. It is not cleared when enabling the PHY, since the
link will not yet be up (e.g. when an EP controller is connected
back-to-back to the host controller and its PHY is still disabled).
Link: http://lkml.kernel.org/r/1529915453-4633-5-git-send-email-adouglas@cadence.com
Signed-off-by: Alan Douglas <adouglas@cadence.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Add support for MSI to the kirin host controller driver, based
on the generic dwc infrastructure.
Signed-off-by: Xiaowei Song <songxiaowei@hisilicon.com>
Signed-off-by: Yao Chen <chenyao11@huawei.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
If PHYs are present, initialize and enable them at driver probe.
Signed-off-by: Alan Douglas <adouglas@cadence.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
cdns_pcie_writel() writes a long value; change the value parameter type
from u16 to u32 to rectify the function signature and related behaviour.
Signed-off-by: Alan Douglas <adouglas@cadence.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
commit ef1433f717 ("PCI: endpoint: Create configfs entry for each
pci_epf_device_id table entry") while adding configfs entry for each
pci_epf_device_id table entry introduced a NULL pointer dereference error
when CONFIG_PCI_ENDPOINT_CONFIGFS is not enabled.
Fix it here.
Fixes: ef1433f717 ("PCI: endpoint: Create configfs entry for each
pci_epf_device_id table entry")
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
[lorenzo.pieralisi: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Commit de0aa7b2f9 ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()")
uses local_bh_disable()/enable(), because hv_pci_onchannelcallback() can
also run in tasklet context as the channel event callback, so bottom halves
should be disabled to prevent a race condition.
With CONFIG_PROVE_LOCKING=y in the recent mainline, or old kernels that
don't have commit f71b74bca6 ("irq/softirqs: Use lockdep to assert IRQs
are disabled/enabled"), when the upper layer IRQ code calls
hv_compose_msi_msg() with local IRQs disabled, we'll see a warning at the
beginning of __local_bh_enable_ip():
IRQs not enabled as expected
WARNING: CPU: 0 PID: 408 at kernel/softirq.c:162 __local_bh_enable_ip
The warning exposes an issue in de0aa7b2f97d: local_bh_enable() can
potentially call do_softirq(), which is not supposed to run when local IRQs
are disabled. Let's fix this by using local_irq_save()/restore() instead.
Note: hv_pci_onchannelcallback() is not a hot path because it's only called
when the PCI device is hot added and removed, which is infrequent.
Fixes: de0aa7b2f9 ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: stable@vger.kernel.org
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAltBPacUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vwx+hAAvzTP6o3VOtgNK7lm3nOBuzfykgCv
TFhXP2yeDItWDBLDpWX7wjs2657W3Sjrpw6FyVIYvsoMKKRuOYeZ6ChDieG5ZgTj
oxav5U6TlDHoF0fNq0LWfv78lP6+++7/6yaer6j9xDksVqE4/zxlFcExxuszhZlC
8ptJ54ORn92RdfRHCDptA4PFReNlQWNw3bpKpGxu8xj0TN/sYPN3ggHfnmiEyGMZ
8/KLNOzGrhoqztaBPyRoG3GhU1hpicT+fWzg11Li8hP1solQDht+S3mnCC1IxqdI
JSU7/jhti4nUV56qE7QzYzdsVbTFcItGn0WImklIFCt7htp4Rms0wN+QCmwhtjKM
0WH02gRDip4CcYcPZGeGaeexWJFEScamFK1yxxDV7059KsoQKUN1Sm+R7y9xORYw
nQAHPTO/nv02Xo+ADAYzV0aBPD7fEvFaWtdXAuLocVWVj3eiEqp8ftoYmuWym6T3
gHWt9Dod8olj3t9bkR1MWZ121Ar5MsobgJSF30smEx8VMKv+4Xx8eXvq0FCuUQwT
s/WLTPqK31Sqjii0xe1wO8g0yexCVaBpXJB/e9PpYf/+ICItG1DHLz0Ygw01QWVK
Tv7HTAofdO42kChHjErB6GbzSuxDxSVCPN26bwkZu0eV91uKiLKA7wLUv8+qSXlZ
lK98Eypy0Ti7pvA=
=IOF/
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- Fix a use-after-free in the endpoint code (Dan Carpenter)
- Stop defaulting CONFIG_PCIE_DW_PLAT_HOST to yes (Geert Uytterhoeven)
- Fix an nfp regression caused by a change in how we limit the number
of VFs we can enable (Jakub Kicinski)
- Fix failure path cleanup issues in the new R-Car gen3 PHY support
(Marek Vasut)
- Fix leaks of OF nodes in faraday, xilinx-nwl, xilinx (Nicholas Mc
Guire)
* tag 'pci-v4.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
nfp: stop limiting VFs to 0
PCI/IOV: Reset total_VFs limit after detaching PF driver
PCI: faraday: Add missing of_node_put()
PCI: xilinx-nwl: Add missing of_node_put()
PCI: xilinx: Add missing of_node_put()
PCI: endpoint: Use after free in pci_epf_unregister_driver()
PCI: controller: dwc: Do not let PCIE_DW_PLAT_HOST default to yes
PCI: rcar: Clean up PHY init on failure
PCI: rcar: Shut the PHY down in failpath
Part of advk_pcie_probe() is exactly an open-coded version of
pci_host_probe(). So instead of duplicating this code, use
pci_host_probe() directly.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
The PCIE I/O and MEM resource allocation mechanism is that root bus
goes through the following steps:
1. Check PCI bridges' range and computes I/O and Mem base/limits.
2. Sort all subordinate devices I/O and MEM resource requirements and
allocate the resources and writes/updates subordinate devices'
requirements to PCI bridges I/O and Mem MEM/limits registers.
Currently, PCI Aardvark driver only handles the second step and lacks
the first step, so there is an I/O and MEM resource allocation failure
when using a PCI switch. This commit fixes that by sizing bridges
before doing the resource allocation.
Fixes: 8c39d71036 ("PCI: aardvark: Add Aardvark PCI host controller
driver")
Signed-off-by: Zachary Zhang <zhangzg@marvell.com>
[Thomas: edit commit log.]
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <stable@vger.kernel.org>
So drivers can use them. This can be used to replace
duplicate code in the drm subsystem.
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It is reported that commit c62ec4610c (PM / core: Fix direct_complete
handling for devices with no callbacks) introduced a system suspend
regression on Samsung 305V4A by allowing a PCI bridge (not a PCIe
port) to stay in D3 over suspend-to-RAM, which is a side effect of
setting power.direct_complete for the children of that bridge that
have no PM callbacks.
On the majority of systems PCI bridges are not allowed to be
runtime-suspended (the power/control sysfs attribute is set to "on"
for them by default), but user space can change that setting and if
it does so and a given bridge has no children with PM callbacks, the
direct_complete optimization will be applied to it and it will stay
in suspend over system suspend. Apparently, that confuses the
platform firmware on the affected machine and that may very well
happen elsewhere, so avoid the direct_complete optimization for
PCI bridges with no drivers (if there is a driver, it should take
care of the PM handling) on suspend-to-RAM altogether (that should
not matter for suspend-to-idle as platform firmware is not involved
in it).
Fixes: c62ec4610c (PM / core: Fix direct_complete handling for devices with no callbacks)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199941
Reported-by: n0000b.n000b@gmail.com
Tested-by: n0000b.n000b@gmail.com
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A PCIe endpoint carries the process address space identifier (PASID) in
the TLP prefix as part of the memory read/write transaction. The address
information in the TLP is relevant only for a given PASID context.
An IOMMU takes PASID value and the address information from the
TLP to look up the physical address in the system.
PASID is an End-End TLP Prefix (PCIe r4.0, sec 6.20). Sec 2.2.10.2 says
It is an error to receive a TLP with an End-End TLP Prefix by a
Receiver that does not support End-End TLP Prefixes. A TLP in
violation of this rule is handled as a Malformed TLP. This is a
reported error associated with the Receiving Port (see Section 6.2).
Prevent error condition by proactively requiring End-End TLP prefix to be
supported on the entire data path between the endpoint and the root port
before enabling PASID.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Seeing there's been some confusion about the use of pci_add_dma_alias(),
expand the comment to describe why it must be called early and how
early it must be called.
Also, expand on the purpose of this function and common reasons it would
be used.
[The comment was reworded to some extent by Alex Williamson]
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Doug Meyer <dmeyer@gigaio.com>
pci_get_rom_size() is called only from pci_map_rom(), so it can be static.
Make it static and remove the declaration from include/linux/pci.h.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If the "last image" indicator was not set in the PCI data struct, print "No
more image in the PCI ROM" instead of looping back and printing "Invalid
PCI ROM header signature".
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
pci_get_rom_size() accepts the base and size of the ROM BAR as arguments.
The byte at "rom + size" is the first byte *past* the ROM, so change ">" to
">=" to avoid accessing beyond the actual length of the ROM BAR.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Add a quirk for the Microsemi Switchtec parts to allow DMA access via
non-transparent bridging to work when the IOMMU is turned on.
This exclusively addresses the ability of a remote NT endpoint to perform
DMA accesses through the locally enumerated NT endpoint. Other aspects of
the Switchtec NTB functionality, such as interrupts for doorbells and
messages are independent of this quirk, and will work whether the IOMMU is
on or off.
When a requestor on one NT endpoint accesses memory on another NT endpoint,
it does this via a devfn proxy ID. Proxy IDs are statically assigned to
each NT endpoint by the NTB hardware as part of the release-from-reset
sequence prior to PCI enumeration. These proxy IDs cannot be modified
dynamically, and are not visible to the host during enumeration.
When the Switchtec NTB driver loads it will map local requestor IDs, such
as the root complex and transparent bridge DMA engines, to proxy IDs by
populating those requestor IDs in hardware mapping table table entries.
This establishes a fixed relationship between a requestor ID and a proxy
ID.
When a peer on a remote NT endpoint performs an access within a particular
translation window in it's NT endpoint BAR address space, that access is
translated to a DMA request on the local endpoint's bus. As part of the
translation process, the original requestor ID has its devfn replaced with
the proxy ID, and the bus portion of the BDF is replaced with the bus of
the local NT endpoint. Thus, the DMA access from a remote NT endpoint will
appear on the local bus to have come from the unknown devfn which the IOMMU
will reject.
Interrogate NTB hardware registers for each remote NT endpoint to obtain
the proxy IDs that have been assigned to it and alias them to the local
(enumerated) NT endpoint's device. The IOMMU then accepts the remote proxy
IDs as if they were requests coming directly from the enumerated endpoint,
giving remote requestors access to memory resources which the local host
has made available.
Note that the aliasing of the proxy IDs cannot be performed at the driver
level given the current IOMMU architecture. Superficially this is because
pci_add_dma_alias() symbol is not exported. Functionally, the current
IOMMU design requires the aliasing to be performed prior to the creation of
IOMMU groups. If a driver were to attempt to use pci_add_dma_alias() in
its probe routine it would fail since the IOMMU groups have been set up by
that time. If the Switchtec hardware supported dynamic proxy ID
(re-)assignment this would be an issue, but it does not.
To further clarify static proxy ID assignment: While the requester ID to
proxy ID mapping can be dynamically changed, the number and value of proxy
IDs given to an NT EP cannot, even for dynamic reconfiguration such as
hot-add. Therefore, the chip configuration must account a priori for the
proxy IDs needs, considering both static and dynamic system configurations.
For example, a port on the chip may not having anything plugged into it at
start of day; but it must have a sufficient number of proxy IDs assigned to
accommodate the supported devices which may be hot-added.
Switchtec NTB functionality with the IOMMU off is unchanged by this quirk.
Signed-off-by: Doug Meyer <dmeyer@gigaio.com>
[bhelgaas: use hard-coded Device IDs instead of adding #defines for each]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Move the Microsemi Switchtec PCI Vendor ID (same as
PCI_VENDOR_ID_PMC_Sierra) to pci_ids.h. Also, replace Microsemi class
constants with the standard PCI definitions.
Signed-off-by: Doug Meyer <dmeyer@gigaio.com>
[bhelgaas: restore SPDX (I assume it was removed by mistake), remove
device ID definitions]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Move early dump functionality into common code so that it is available for
all architectures. No need to carry arch-specific reads around as the read
hooks are already initialized by the time pci_setup_device() is getting
called during scan.
Tested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cleanup PCI_REBAR_CTRL_BAR_SHIFT handling. That was hard coded instead of
properly defined in the header for some reason.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Resize BARs after resume to the expected size again.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=199959
Fixes: d6895ad39f ("drm/amdgpu: resize VRAM BAR for CPU access v6")
Fixes: 276b738deb ("PCI: Add resizable BAR infrastructure")
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v4.15+
The TotalVFs register in the SR-IOV capability is the hardware limit on the
number of VFs. A PF driver can limit the number of VFs further with
pci_sriov_set_totalvfs(). When the PF driver is removed, reset any VF
limit that was imposed by the driver because that limit may not apply to
other drivers.
Before 8d85a7a4f2 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0"),
pci_sriov_set_totalvfs(pdev, 0) meant "we can enable TotalVFs virtual
functions", and the nfp driver used that to remove the VF limit when the
driver unloads.
8d85a7a4f2 broke that because instead of removing the VF limit,
pci_sriov_set_totalvfs(pdev, 0) actually sets the limit to zero, and that
limit persists even if another driver is loaded.
We could fix that by making the nfp driver reset the limit when it unloads,
but it seems more robust to do it in the PCI core instead of relying on the
driver.
The regression scenario is:
nfp_pci_probe (driver 1)
...
nfp_pci_remove
pci_sriov_set_totalvfs(pf->pdev, 0) # limits VFs to 0
...
nfp_pci_probe (driver 2)
nfp_rtsym_read_le("nfd_vf_cfg_max_vfs")
# no VF limit from firmware
Now driver 2 is broken because the VF limit is still 0 from driver 1.
Fixes: 8d85a7a4f2 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
[bhelgaas: changelog, rename functions]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The call to of_get_next_child() returns a node pointer with refcount
incremented thus it must be explicitly decremented here in the error
path and after the last usage.
Fixes: d3c68e0a7e ("PCI: faraday: Add Faraday Technology FTPCI100 PCI Host Bridge driver")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
The call to of_get_next_child() returns a node pointer with
refcount incremented thus it must be explicitly decremented
here after the last usage.
Fixes: ab597d35ef ("PCI: xilinx-nwl: Add support for Xilinx NWL PCIe Host Controller")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The call to of_get_next_child() returns a node pointer with refcount
incremented thus it must be explicitly decremented here after the last
usage.
Fixes: 8961def568 ("PCI: xilinx: Add Xilinx AXI PCIe Host Bridge IP driver")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
[lorenzo.pieralisi@arm.com: reworked commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We need to use list_for_each_entry_safe() because the
pci_ep_cfs_remove_epf_group() function frees "group".
Fixes: ef1433f717 ("PCI: endpoint: Create configfs entry for each pci_epf_device_id table entry")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
PCIE_DW_PLAT_HOST does not have any platform dependency, so it should
not default to yes.
Fixes: 1d906b2207 ("PCI: dwc: Add support for EP mode")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
If the Gen3 PHY fails to power up, the code does not undo the
initialization caused by phy_init(). Add the missing failure
handling to the rcar_pcie_phy_init_gen3() function.
Fixes: 517ca93a71 ("PCI: rcar: Add R-Car gen3 PHY support")
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Phil Edworthy <phil.edworthy@renesas.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
If anything fails past phy_init_fn() and the system is a Gen3 with
a PHY, the PHY will be left on and inited. This is caused by the
phy_init_fn, which is in fact a pointer to rcar_pcie_phy_init_gen3()
function, which starts the PHY, yet has no counterpart in the failpath.
Add that counterpart.
Fixes: 517ca93a71 ("PCI: rcar: Add R-Car gen3 PHY support")
Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Phil Edworthy <phil.edworthy@renesas.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
new_pcichild_device() is not called in atomic context.
The call chain ending up at new_pcichild_device() is:
[1] new_pcichild_device() <- pci_devices_present_work()
pci_devices_present_work() is only set in INIT_WORK().
Despite never getting called from atomic context,
new_pcichild_device() calls kzalloc with GFP_ATOMIC,
which waits busily for allocation.
GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL
to avoid busy waiting.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
[lorenzo.pieralisi@arm.com: reworked commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Devices with slow interrupt handlers are significantly harming
performance when their interrupt vector is shared with a fast device.
Create a class code white list for devices with known fast interrupt
handlers and let all other devices share a single vector so that they
don't interfere with performance.
At the moment, only the NVM Express class code is on the list, but more
may be added if VMD users desire to use other low-latency devices in
these domains.
Signed-off-by: Keith Busch <keith.busch@intel.com>
[lorenzo.pieralisi@arm.com: changelog]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Jon Derrick: <jonathan.derrick@intel.com>
Outbound window is used to translate CPU space addresses to PCIe space
addresses when the CPU initiates PCIe transactions.
According to the suggestion of the HW designers, the recommended
solution is to use the default outbound parameters, even though the
current outbound window setting does not cause any known functional
issue.
This patch doesn't address any known functional issue, but aligns to
HW design guidelines, and removes code that isn't needed.
Signed-off-by: Evan Wang <xswang@marvell.com>
[Thomas: tweak commit log.]
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
[lorenzo.pieralisi@arm.com: handled host->controller dir move]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Victor Gu <xigu@marvell.com>
Reviewed-by: Nadav Haklai <nadavh@marvell.com>
In other to mimic other PCIe host controller drivers, introduce an
advk_pcie_valid_device() helper, used in the configuration read/write
functions.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
[lorenzo.pieralisi@arm.com: updated host->controller dir move]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
The shpchp driver registers for all PCI bridge devices. Its probe method
should fail if either (1) the bridge doesn't have an SHPC or (2) the OS
isn't allowed to use it (the platform firmware may be operating the SHPC
itself).
Separate these two tests into:
- A new shpc_capable() that looks for the SHPC hardware and is applicable
on all systems (ACPI and non-ACPI), and
- A simplified acpi_get_hp_hw_control_from_firmware() that we call only
when we already know an SHPC exists and there may be ACPI methods to
either request permission to use it (_OSC) or transfer control to the
OS (OSHP).
acpi_get_hp_hw_control_from_firmware() is implemented when CONFIG_ACPI=y,
but does nothing if the current platform doesn't support ACPI.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Commit 51bc085d64 ("PCI: Improve host drivers compile test coverage")
added configuration options to allow PCI host controller drivers to be
compile tested on all architectures.
Some host controller drivers (eg PCIE_ALTERA) config entries select the
PCI_DOMAINS config option to enable PCI domains management in the kernel.
Now that host controller drivers can be compiled on all architectures, this
triggers build regressions on arches that do not implement the PCI_DOMAINS
required API (ie pci_domain_nr()):
drivers/ata/pata_ali.c: In function 'ali_init_chipset':
drivers/ata/pata_ali.c:469:38: error: implicit declaration of function 'pci_domain_nr'; did you mean 'pci_iomap_wc'?
Furthemore, some software configurations (ie Jailhouse) require a
PCI_DOMAINS enabled kernel to configure multiple host controllers without
having an explicit dependency on the ARM platform on which they run.
Make PCI_DOMAINS a visible configuration option on ARM so that software
configurations that need it can manually select it and move the PCI_DOMAINS
selection from PCI controllers configuration file to ARM sub-arch config
entries that currently require it, fixing the issue.
Fixes: 51bc085d64 ("PCI: Improve host drivers compile test coverage")
Link: https://lkml.kernel.org/r/20180612170229.GA10141@roeck-us.net
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Ley Foon Tan <ley.foon.tan@intel.com>
Acked-by: Rob Herring <robh@kernel.org>
Cc: Scott Branden <scott.branden@broadcom.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Guenter Roeck <linux@roeck-us.net>
The endpoint library must be initialized before its users, which are in
drivers/pci/controllers. The endpoint initialization currently depends on
link order.
This corrects a kernel crash when loading the Cadence EP driver, since it
calls devm_pci_epc_create() and this is only valid once the endpoint
library has been initialized.
Fixes: 6e0832fa43 ("PCI: Collect all native drivers under drivers/pci/controller/")
Signed-off-by: Alan Douglas <adouglas@cadence.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
An SHPC can be operated either by platform firmware or by the OS. The OS
uses a host bridge ACPI _OSC method to negotiate for control of SHPC. If
firmware wants to prevent an OS from operating an SHPC, it must supply an
_OSC method that declines to grant SHPC ownership to the OS.
If acpi_pci_find_root() returns NULL, it means there's no ACPI host bridge
device (PNP0A03 or PNP0A08) and hence no _OSC method, so the OS is always
allowed to manage the SHPC.
Fix a NULL pointer dereference when CONFIG_ACPI=y but the current
hardware/firmware platform doesn't support ACPI. In that case,
acpi_get_hp_hw_control_from_firmware() is implemented but
acpi_pci_find_root() returns NULL.
Fixes: 90cc0c3cc7 ("PCI: shpchp: Add shpchp_is_native()")
Link: https://lkml.kernel.org/r/20180621164715.28160-1-marc.zyngier@arm.com
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Instead of first allocating and then freeing memory for struct resource in
case we cannot parse a PCI resource from the device tree, work against a
local struct and kmemdup() it when we decide to go with it.
Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
- Additional struct_size() conversions (Matthew, Kees)
- Explicitly reported overflow fixes (Silvio, Kees)
- Add missing kvcalloc() function (Kees)
- Treewide conversions of allocators to use either 2-factor argument
variant when available, or array_size() and array3_size() as needed (Kees)
-----BEGIN PGP SIGNATURE-----
Comment: Kees Cook <kees@outflux.net>
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlsgVtMWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJhsJEACLYe2EbwLFJz7emOT1KUGK5R1b
oVxJog0893WyMqgk9XBlA2lvTBRBYzR3tzsadfYo87L3VOBzazUv0YZaweJb65sF
bAvxW3nY06brhKKwTRed1PrMa1iG9R63WISnNAuZAq7+79mN6YgW4G6YSAEF9lW7
oPJoPw93YxcI8JcG+dA8BC9w7pJFKooZH4gvLUSUNl5XKr8Ru5YnWcV8F+8M4vZI
EJtXFmdlmxAledUPxTSCIojO8m/tNOjYTreBJt9K1DXKY6UcgAdhk75TRLEsp38P
fPvMigYQpBDnYz2pi9ourTgvZLkffK1OBZ46PPt8BgUZVf70D6CBg10vK47KO6N2
zreloxkMTrz5XohyjfNjYFRkyyuwV2sSVrRJqF4dpyJ4NJQRjvyywxIP4Myifwlb
ONipCM1EjvQjaEUbdcqKgvlooMdhcyxfshqJWjHzXB6BL22uPzq5jHXXugz8/ol8
tOSM2FuJ2sBLQso+szhisxtMd11PihzIZK9BfxEG3du+/hlI+2XgN7hnmlXuA2k3
BUW6BSDhab41HNd6pp50bDJnL0uKPWyFC6hqSNZw+GOIb46jfFcQqnCB3VZGCwj3
LH53Be1XlUrttc/NrtkvVhm4bdxtfsp4F7nsPFNDuHvYNkalAVoC3An0BzOibtkh
AtfvEeaPHaOyD8/h2Q==
=zUUp
-----END PGP SIGNATURE-----
Merge tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull more overflow updates from Kees Cook:
"The rest of the overflow changes for v4.18-rc1.
This includes the explicit overflow fixes from Silvio, further
struct_size() conversions from Matthew, and a bug fix from Dan.
But the bulk of it is the treewide conversions to use either the
2-factor argument allocators (e.g. kmalloc(a * b, ...) into
kmalloc_array(a, b, ...) or the array_size() macros (e.g. vmalloc(a *
b) into vmalloc(array_size(a, b)).
Coccinelle was fighting me on several fronts, so I've done a bunch of
manual whitespace updates in the patches as well.
Summary:
- Error path bug fix for overflow tests (Dan)
- Additional struct_size() conversions (Matthew, Kees)
- Explicitly reported overflow fixes (Silvio, Kees)
- Add missing kvcalloc() function (Kees)
- Treewide conversions of allocators to use either 2-factor argument
variant when available, or array_size() and array3_size() as needed
(Kees)"
* tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (26 commits)
treewide: Use array_size in f2fs_kvzalloc()
treewide: Use array_size() in f2fs_kzalloc()
treewide: Use array_size() in f2fs_kmalloc()
treewide: Use array_size() in sock_kmalloc()
treewide: Use array_size() in kvzalloc_node()
treewide: Use array_size() in vzalloc_node()
treewide: Use array_size() in vzalloc()
treewide: Use array_size() in vmalloc()
treewide: devm_kzalloc() -> devm_kcalloc()
treewide: devm_kmalloc() -> devm_kmalloc_array()
treewide: kvzalloc() -> kvcalloc()
treewide: kvmalloc() -> kvmalloc_array()
treewide: kzalloc_node() -> kcalloc_node()
treewide: kzalloc() -> kcalloc()
treewide: kmalloc() -> kmalloc_array()
mm: Introduce kvcalloc()
video: uvesafb: Fix integer overflow in allocation
UBIFS: Fix potential integer overflow in allocation
leds: Use struct_size() in allocation
Convert intel uncore to struct_size
...
- squash AER directory into drivers/pci/pcie/aer.c (Bjorn Helgaas)
* pci/aer-squash:
PCI/AER: Use "PCI Express" consistently in Kconfig text
PCI/AER: Hoist aerdrv.c, aer_inject.c up to drivers/pci/pcie/
PCI/AER: Squash Kconfig.debug into Kconfig
PCI/AER: Move private AER things to aerdrv.c
PCI/AER: Move aer_irq() declaration to portdrv.h
PCI/AER: Move pcie_aer_get_firmware_first() to portdrv.h
PCI/AER: Remove duplicate pcie_port_bus_type declaration
PCI/AER: Squash ecrc.c into aerdrv.c
PCI/AER: Squash aerdrv_acpi.c into aerdrv.c
PCI/AER: Squash aerdrv_errprint.c into aerdrv.c
PCI/AER: Squash aerdrv_core.c into aerdrv.c
PCI/AER: Reorder code to group probe/remove stuff together
PCI/AER: Remove forward declarations
Use "PCI Express" consistently in Kconfig text. No functional change
intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Hoist aerdrv.c, aer_inject.c up to drivers/pci/pcie/ so they're next to
other PCIe service drivers. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Most of the things in aerdrv.h are only used in aerdrv.c, so move them
there. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
The aer_irq() declaration is the only thing needed by aer_inject.c. Move
it to portdrv.h so we eventually get rid of aerdrv.h completely. No
functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Move pcie_aer_get_firmware_first() to portdrv.h, where it can be more
easily shared between AER and DPC. Then DPC no longer needs to include
aer/aerdrv.h. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
pcie_port_bus_type is already declared in portdrv.h, so remove the
unnecessary duplicate declaration in aerdrv.h. No functional change
intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reorder code to group probe/remove stuff together. No functional change
intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Native PCI drivers for root complex devices were originally all in
drivers/pci/host/. Some of these devices can also be operated in endpoint
mode. Drivers for endpoint mode didn't seem to fit in the "host"
directory, so we put both the root complex and endpoint drivers in
per-device directories, e.g., drivers/pci/dwc/, drivers/pci/cadence/, etc.
These per-device directories contain trivial Kconfig and Makefiles and
clutter drivers/pci/. Make a new drivers/pci/controllers/ directory and
collect all the device-specific drivers there.
No functional change intended.
Link: https://lkml.kernel.org/r/1520304202-232891-1-git-send-email-shawn.lin@rock-chips.com
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlsZdg0UHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vwJOBAAsuuWsOdiJRRhQLU5WfEMFgzcL02R
gsumqZkK7E8LOq0DPNMtcgv9O0KgYZyCiZyTMJ8N7sEYohg04lMz8mtYXOibjcwI
p+nVMko8jQXV9FXwSMGVqigEaLLcrbtkbf/mPriD63DDnRMa/+/Jh15SwfLTydIH
QRTJbIxkS3EiOauj5C8QY3UwzjlvV9mDilzM/x+MSK27k2HFU9Pw/3lIWHY716rr
grPZTwBTfIT+QFZjwOm6iKzHjxRM830sofXARkcH4CgSNaTeq5UbtvAs293MHvc+
v/v/1dfzUh00NxfZDWKHvTUMhjazeTeD9jEVS7T+HUcGzvwGxMSml6bBdznvwKCa
46ynePOd1VcEBlMYYS+P4njRYBLWeUwt6/TzqR4yVwb0keQ6Yj3Y9H2UpzscYiCl
O+0qz6RwyjKY0TpxfjoojgHn4U5ByI5fzVDJHbfr2MFTqqRNaabVrfl6xU4sVuhh
OluT5ym+/dOCTI/wjlolnKNb0XThVre8e2Busr3TRvuwTMKMIWqJ9sXLovntdbqE
furPD/UnuZHkjSFhQ1SQwYdWmsZI5qAq2C9haY8sEWsXEBEcBGLJ2BEleMxm8UsL
KXuy4ER+R4M+sFtCkoWf3D4NTOBUdPHi4jyk6Ooo1idOwXCsASVvUjUEG5YcQC6R
kpJ1VPTKK1XN64I=
=aFAi
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- unify AER decoding for native and ACPI CPER sources (Alexandru
Gagniuc)
- add TLP header info to AER tracepoint (Thomas Tai)
- add generic pcie_wait_for_link() interface (Oza Pawandeep)
- handle AER ERR_FATAL by removing and re-enumerating devices, as
Downstream Port Containment does (Oza Pawandeep)
- factor out common code between AER and DPC recovery (Oza Pawandeep)
- stop triggering DPC for ERR_NONFATAL errors (Oza Pawandeep)
- share ERR_FATAL recovery path between AER and DPC (Oza Pawandeep)
- disable ASPM L1.2 substate if we don't have LTR (Bjorn Helgaas)
- respect platform ownership of LTR (Bjorn Helgaas)
- clear interrupt status in top half to avoid interrupt storm (Oza
Pawandeep)
- neaten pci=earlydump output (Andy Shevchenko)
- avoid errors when extended config space inaccessible (Gilles Buloz)
- prevent sysfs disable of device while driver attached (Christoph
Hellwig)
- use core interface to report PCIe link properties in bnx2x, bnxt_en,
cxgb4, ixgbe (Bjorn Helgaas)
- remove unused pcie_get_minimum_link() (Bjorn Helgaas)
- fix use-before-set error in ibmphp (Dan Carpenter)
- fix pciehp timeouts caused by Command Completed errata (Bjorn
Helgaas)
- fix refcounting in pnv_php hotplug (Julia Lawall)
- clear pciehp Presence Detect and Data Link Layer Status Changed on
resume so we don't miss hotplug events (Mika Westerberg)
- only request pciehp control if we support it, so platform can use
ACPI hotplug otherwise (Mika Westerberg)
- convert SHPC to be builtin only (Mika Westerberg)
- request SHPC control via _OSC if we support it (Mika Westerberg)
- simplify SHPC handoff from firmware (Mika Westerberg)
- fix an SHPC quirk that mistakenly included *all* AMD bridges as well
as devices from any vendor with device ID 0x7458 (Bjorn Helgaas)
- assign a bus number even to non-native hotplug bridges to leave
space for acpiphp additions, to fix a common Thunderbolt xHCI
hot-add failure (Mika Westerberg)
- keep acpiphp from scanning native hotplug bridges, to fix common
Thunderbolt hot-add failures (Mika Westerberg)
- improve "partially hidden behind bridge" messages from core (Mika
Westerberg)
- add macros for PCIe Link Control 2 register (Frederick Lawler)
- replace IB/hfi1 custom macros with PCI core versions (Frederick
Lawler)
- remove dead microblaze and xtensa code (Bjorn Helgaas)
- use dev_printk() when possible in xtensa and mips (Bjorn Helgaas)
- remove unused pcie_port_acpi_setup() and portdrv_acpi.c (Bjorn
Helgaas)
- add managed interface to get PCI host bridge resources from OF (Jan
Kiszka)
- add support for unbinding generic PCI host controller (Jan Kiszka)
- fix memory leaks when unbinding generic PCI host controller (Jan
Kiszka)
- request legacy VGA framebuffer only for VGA devices to avoid false
device conflicts (Bjorn Helgaas)
- turn on PCI_COMMAND_IO & PCI_COMMAND_MEMORY in pci_enable_device()
like everybody else, not in pcibios_fixup_bus() (Bjorn Helgaas)
- add generic enable function for simple SR-IOV hardware (Alexander
Duyck)
- use generic SR-IOV enable for ena, nvme (Alexander Duyck)
- add ACS quirk for Intel 7th & 8th Gen mobile (Alex Williamson)
- add ACS quirk for Intel 300 series (Mika Westerberg)
- enable register clock for Armada 7K/8K (Gregory CLEMENT)
- reduce Keystone "link already up" log level (Fabio Estevam)
- move private DT functions to drivers/pci/ (Rob Herring)
- factor out dwc CONFIG_PCI Kconfig dependencies (Rob Herring)
- add DesignWare support to the endpoint test driver (Gustavo
Pimentel)
- add DesignWare support for endpoint mode (Gustavo Pimentel)
- use devm_ioremap_resource() instead of devm_ioremap() in dra7xx and
artpec6 (Gustavo Pimentel)
- fix Qualcomm bitwise NOT issue (Dan Carpenter)
- add Qualcomm runtime PM support (Srinivas Kandagatla)
- fix DesignWare enumeration below bridges (Koen Vandeputte)
- use usleep() instead of mdelay() in endpoint test (Jia-Ju Bai)
- add configfs entries for pci_epf_driver device IDs (Kishon Vijay
Abraham I)
- clean up pci_endpoint_test driver (Gustavo Pimentel)
- update Layerscape maintainer email addresses (Minghuan Lian)
- add COMPILE_TEST to improve build test coverage (Rob Herring)
- fix Hyper-V bus registration failure caused by domain/serial number
confusion (Sridhar Pitchai)
- improve Hyper-V refcounting and coding style (Stephen Hemminger)
- avoid potential Hyper-V hang waiting for a response that will never
come (Dexuan Cui)
- implement Mediatek chained IRQ handling (Honghui Zhang)
- fix vendor ID & class type for Mediatek MT7622 (Honghui Zhang)
- add Mobiveil PCIe host controller driver (Subrahmanya Lingappa)
- add Mobiveil MSI support (Subrahmanya Lingappa)
- clean up clocks, MSI, IRQ mappings in R-Car probe failure paths
(Marek Vasut)
- poll more frequently (5us vs 5ms) while waiting for R-Car data link
active (Marek Vasut)
- use generic OF parsing interface in R-Car (Vladimir Zapolskiy)
- add R-Car V3H (R8A77980) "compatible" string (Sergei Shtylyov)
- add R-Car gen3 PHY support (Sergei Shtylyov)
- improve R-Car PHYRDY polling (Sergei Shtylyov)
- clean up R-Car macros (Marek Vasut)
- use runtime PM for R-Car controller clock (Dien Pham)
- update arm64 defconfig for Rockchip (Shawn Lin)
- refactor Rockchip code to facilitate both root port and endpoint
mode (Shawn Lin)
- add Rockchip endpoint mode driver (Shawn Lin)
- support VMD "membar shadow" feature (Jon Derrick)
- support VMD bus number offsets (Jon Derrick)
- add VMD "no AER source ID" quirk for more device IDs (Jon Derrick)
- remove unnecessary host controller CONFIG_PCIEPORTBUS Kconfig
selections (Bjorn Helgaas)
- clean up quirks.c organization and whitespace (Bjorn Helgaas)
* tag 'pci-v4.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (144 commits)
PCI/AER: Replace struct pcie_device with pci_dev
PCI/AER: Remove unused parameters
PCI: qcom: Include gpio/consumer.h
PCI: Improve "partially hidden behind bridge" log message
PCI: Improve pci_scan_bridge() and pci_scan_bridge_extend() doc
PCI: Move resource distribution for single bridge outside loop
PCI: Account for all bridges on bus when distributing bus numbers
ACPI / hotplug / PCI: Drop unnecessary parentheses
ACPI / hotplug / PCI: Mark stale PCI devices disconnected
ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug
PCI: hotplug: Add hotplug_is_native()
PCI: shpchp: Add shpchp_is_native()
PCI: shpchp: Fix AMD POGO identification
PCI: mobiveil: Add MSI support
PCI: mobiveil: Add Mobiveil PCIe Host Bridge IP driver
PCI/AER: Decode Error Source Requester ID
PCI/AER: Remove aer_recover_work_func() forward declaration
PCI/DPC: Use the generic pcie_do_fatal_recovery() path
PCI/AER: Pass service type to pcie_do_fatal_recovery()
PCI/DPC: Disable ERR_NONFATAL handling by DPC
...
- support VMD "membar shadow" feature (Jon Derrick)
- support VMD bus number offsets (Jon Derrick)
- add VMD "no AER source ID" quirk for more device IDs (Jon Derrick)
* lorenzo/pci/vmd:
PCI: vmd: Add an additional VMD device id to driver device id table
x86/PCI: Add additional VMD device root ports to VMD AER quirk
PCI: vmd: Add offset to bus numbers if necessary
PCI: vmd: Assign membar addresses from shadow registers
PCI: Add Intel VMD devices to pci ids
- update arm64 defconfig for Rockchip (Shawn Lin)
- refactor Rockchip code to facilitate both root port and endpoint mode
(Shawn Lin)
- add Rockchip endpoint mode driver (Shawn Lin)
* lorenzo/pci/rockchip:
arm64: defconfig: update config for Rockchip PCIe
dt-bindings: PCI: rockchip: Add DT bindings for Rockchip PCIe EP driver
PCI: rockchip: Add EP driver for Rockchip PCIe controller
dt-bindings: PCI: rockchip: Rename rockchip-pcie.txt to rockchip-pcie-host.txt
PCI: rockchip: Split out common function to init controller
PCI: rockchip: Split out rockchip_pcie_parse_dt() to parse DT
PCI: rockchip: Separate common code from RC driver
# Conflicts:
# drivers/pci/host/pcie-rockchip.c
- implement Mediatek chained IRQ handling (Honghui Zhang)
- fix vendor ID & class type for Mediatek MT7622 (Honghui Zhang)
* lorenzo/pci/mediatek:
PCI: mediatek: Implement chained IRQ handling setup
PCI: mediatek: Set up vendor ID and class type for MT7622
# Conflicts:
# drivers/pci/host/Kconfig
- fix Hyper-V bus registration failure caused by domain/serial number
confusion (Sridhar Pitchai)
- improve Hyper-V refcounting and coding style (Stephen Hemminger)
- avoid potential Hyper-V hang waiting for a response that will never
come (Dexuan Cui)
* lorenzo/pci/hv:
PCI: hv: Do not wait forever on a device that has disappeared
PCI: hv: Use list_for_each_entry()
PCI: hv: Convert remove_lock to refcount
PCI: hv: Remove unused reason for refcount handler
PCI: hv: Make sure the bus domain is really unique
- use usleep() instead of mdelay() in endpoint test (Jia-Ju Bai)
- add configfs entries for pci_epf_driver device IDs (Kishon Vijay
Abraham I)
- clean up pci_endpoint_test driver (Gustavo Pimentel)
* lorenzo/pci/endpoint:
PCI: endpoint: Create configfs entry for each pci_epf_device_id table entry
misc: pci_endpoint_test: Use pci_irq_vector function
PCI: endpoint: functions/pci-epf-test: Replace lower into upper case characters
misc: pci_endpoint_test: Replace lower into upper case characters
PCI: endpoint: Replace mdelay with usleep_range() in pci_epf_test_write()